Brief Notes on Data Types

I’ve mentioned data types a couple of times by now, and will probably mention them again. Thinking about them is interesting because it’s a window, albeit small, into the way programming languages work. Here are some notes on the nature of data types I’ve thought about, and an invitation to think further.

Data

Data is the plural of datum, and a datum is a unit of information. All sensory input we receive is data, and it’s stored in our brains in the form of neural connections. We can pass it along in the form of sound waves (talking), symbols (writing), and generally by encoding it in a medium using a predefined system of symbols. In the case of computers, we store the data in the form of arranged matter, usually moved through magnetized equipment or electrical current in patterns of two states. Thus, binary (encoded) data. It’s also called digital information, although I find the name too generic: digital information can cover any data encoded using a discrete set of values, as opposed to analog data, so while binary data is digital, not all digital data is binary.

Processors

Processors are collections of electronic components arranged so that they transform collections of electric signals predictably. They are one of the main components of modern computer architectures. When you hear about an 8-, 16-, 32- or 64-bit computer, it usually refers to the size of the collection of signals that the processor handles at once. Each wire (or binary datum) carries a bit: a certain voltage represents a 0, and another one represents a 1. Those digits can be interpreted as part of an n-bit number. They can also be interpreted as instructions. Instructions are numbers are configurations of physical media are any kind of data, because the only way to express any and all data in a computer is through discrete digits. Understanding this is central to understanding data types.

Data Types

Data types are what happens when you decide to treat sequences of data in a special way. In other words, when you make the meaning of certain patterns, and your actions on them, dependent on other, predetermined patterns. You need, first, to know what transformations the processor should apply to the data (sequences of instructions to be applied to the data) if the data has a particular type. If we didn’t have any programming languages, we’d do this by hand – as instructions are numbers are configurations of physical media, we could do this by setting up a particular medium arranged with data that the processor interprets the way we want. This is what a computer BIOS is, by the way. So, we’d know what data we put where, and what we want to treat it like. If it’s letters, we can capitalize them or lower their case. If it’s numbers, we can add or subtract them.

We assign arbitrary values to particular arrangements of data – thus the integer number 64 is the sequence 01000000 in the medium, which is an ‘@’ if interpreted in the context of ASCII encoding. Agreeing on what a particular string of symbols means is the root of many, many issues in modern computing (read about: ISO-8859-1, ASCII, Unicode). In any case, this interpretation and the set of operations we define on a particular datum are what make a data type.
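To make the 64 versus ‘@’ example concrete, here is a minimal Python sketch. The variable names are mine, chosen for illustration; the only thing it relies on is that ASCII assigns code 64 to ‘@’.

```python
# One 8-bit pattern, read under two different conventions.
bits = "01000000"

as_integer = int(bits, 2)        # read the pattern as an unsigned base-2 number
as_character = chr(as_integer)   # read the same value as an ASCII code

print(as_integer)    # 64
print(as_character)  # @

# Going the other way: the character '@' maps back to the same pattern.
round_trip = format(ord("@"), "08b")
print(round_trip == bits)  # True
```

Nothing about the bits changes between the two readings; only the agreement about what they stand for does.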

Programming Languages & Data Types

As we do have programming languages, the work of determining what to do with a particular datum in a particular moment, what is allowed and what is not, is taken from our hands. All languages enforce rules on what can be done with a particular datum; for instance: what datum can be transformed into which other one and how to go about doing it, or what happens if a datum that was stored as a number needs to be added to a datum that was stored as a letter.

The C programming language offers an experience very close to what you’d see working in assembly language, as it places very few restrictions on what you can do with data. Want to add (or subtract, or compare) a number and a character? Go ahead. The operations are executed on the numeric values represented by the data.

Other languages, like JavaScript, convert one of the values for you, following their own rules. Want to add a number and a character? Your number is now a character, and you get a collection of characters (e.g.: 1+’a’ = ‘1a’). Want to multiply them? You get NaN (Not a Number).

Some languages, like Python, prevent you from doing this, and warn you that you’re trying to execute an ambiguous operation. (It is surprising that 2*’a’ = ‘aa’, but that’s a solid, unambiguous way to interpret it. 2+’a’, though… is the answer ‘2a’ or 99? Python asks you to transform the character into a number, or vice versa, and refuses to guess.)
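The Python behavior described above can be checked directly. This is a small sketch; `try_add` is a hypothetical helper I introduce only to capture the error instead of crashing.

```python
def try_add(left, right):
    """Hypothetical helper: return left + right, or the error Python raises."""
    try:
        return left + right
    except TypeError as error:
        return f"TypeError: {error}"

repeated = 2 * "a"          # repetition is unambiguous
refused = try_add(2, "a")   # Python refuses to guess what this means
as_text = str(2) + "a"      # explicit choice: treat the number as text
as_number = 2 + ord("a")    # explicit choice: treat the character as its code

print(repeated)   # aa
print(refused)    # a TypeError message
print(as_text)    # 2a
print(as_number)  # 99
```

Both answers to 2+’a’ are reachable, but only after you say which interpretation you want.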

Epilogue

Some data types are not represented in such a straightforward way as ASCII characters or integer numbers (see: Floating Point numbers). Other data types are built from the group of types the language already knows how to work with; those are called composite, or compound, data types. Perhaps future musings will cover these ones.

Tomorrow, we’re going back to the Java Notes.

Notes on Java 1

Java is an Object Oriented programming language. In this series of posts, we’ll explore different aspects of it: data types, target machine(s), object model, and other details of interest.

Sadly, even though I’d planned to do this in a certain order, my goal of writing a daily post means I won’t start with the story of the language, but with the primitive data types.

The primitive data types of a language are those which aren’t composed of any other data types. If you’ve never worked with a low level language (and, of course, if you’re new to programming) you may be unaware of certain distinguishing features of the data types: size, operations permitted, the way they’re interpreted, and the relation between each type and the others.

In Java, we have 8 primitive data types, which can be roughly divided into categories:

Integer types

  • byte, which is 8 bits in size and signed (negative values use two’s complement); it’s an integer numeric type at its core.
  • short, which is just like byte, except 16 bits in size
  • int, same old, but 32 bits
  • long, you guessed it, but 64 bits
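The ranges those sizes imply follow from two’s complement arithmetic. Here is a sketch in Python rather than Java; `signed_range` and `JAVA_INTEGER_SIZES` are names I made up for illustration.

```python
def signed_range(bits):
    """Smallest and largest values of a two's complement integer with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

# Sizes of the four Java integer types listed above.
JAVA_INTEGER_SIZES = {"byte": 8, "short": 16, "int": 32, "long": 64}

for name, bits in JAVA_INTEGER_SIZES.items():
    low, high = signed_range(bits)
    print(f"{name:5} {bits:2} bits  {low} .. {high}")
```

For byte this gives -128 .. 127: one bit pattern more on the negative side, because zero takes up one of the non-negative patterns.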

Floating Point types (which are imprecise and comply with the IEEE 754 floating point spec; see the Wikipedia explanation)

  • float, which is 32 bits
  • double, which is 64 bits

Units of Data

  • boolean, which is true or false and has no precisely defined size.
  • char, which is 16 bits and represents a Unicode character (strictly speaking, a UTF-16 code unit).

Those types are, along with the Object system, the backbone of Java and its architecture.

Speaking about the object system, the String object is… special. It can be created with quotes (“This is a String”), thus not requiring the paraphernalia that object instantiation needs in Java. I’ll cover the String in a future article.

There’s something to notice: numeric data can be converted to another type (int to byte, long to short, float to int…). When going from a smaller type to a larger one, this can happen implicitly, as it can when going from integer types to floating point types.

Going in the other direction implies that data can be lost – it may not fit! So the transformation must be explicit, to acknowledge that we know what’s going on.
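To see what “it may not fit” means in practice, this Python sketch mimics what an explicit narrowing cast such as (byte) or (short) does to an integer in Java: it keeps the low bits and reads them as two’s complement. `narrow` is a hypothetical helper, not a Java or Python built-in.

```python
def narrow(value, bits):
    """Keep the low `bits` bits of `value` and read them as two's complement."""
    mask = (1 << bits) - 1
    low_bits = value & mask            # discard everything above `bits` bits
    sign_bit = 1 << (bits - 1)
    if low_bits & sign_bit:            # top remaining bit set: negative value
        return low_bits - (1 << bits)
    return low_bits

print(narrow(100, 8))     # 100   (fits in a byte, nothing lost)
print(narrow(300, 8))     # 44    (the high bits are gone)
print(narrow(200, 8))     # -56   (the top bit is now read as the sign)
print(narrow(70000, 16))  # 4464  (does not fit in a short)
```

The value that comes out is perfectly valid for the smaller type; it just may not be the value you started with, which is why the language makes you ask for it.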

We’ll talk more about this and other subjects soon.

Improving Practices

A while ago I started looking into PDCA (Plan-Do-Check-Act), and noticed it plays nicely with the research on self-improvement and deliberate practice. All of them are worth looking into.

I tried formulating a process which took itself into account, and while a formal specification proved beyond the time and focus I could give it at the time, I came away with an intuitive grasp that helped me apply continuous improvement to my efforts to become better all the time.

This means that my improvement in any particular area is bound to accelerate for a while. It’s hard to account for an increase in the rate of improvement – is it because of the process, or would you normally increase your rate of growth in that particular discipline? But at least I get a sense of happiness which helps me practice deliberately and get better, so the mood boost is a welcome bonus.

In loose terms, this is what should be going on:

  • When you work on something, notice what you do, and measure whatever performance indicator you can
  • Meditate on how you can improve the performance on the indicators
  • Execute on the improvements, while measuring
  • If you improved on your indicator, keep your change; otherwise discard it.

Now, verify the ways you’re measuring, the ways you’re meditating – in short, think about how you’re checking your progress.

In diverse areas, I’ve come up with different ways to improve my process, from a Nick Winter-inspired automation of certain measurements through my text editor to discussing the changes with someone instead of just meditating on my own. As a result, I feel more thrilled to work, and to work on my work.

This way, I don’t only improve through practice, but I improve practice itself.

Working vs Planning to Work

There’s a relationship between doing something and thinking (planning, or dreaming) about doing something.

The thinking, dreaming or planning give a certain satisfaction, as if one were doing what needs to be done – or more precisely, achieving what needs to be achieved.

Working is not that gratifying, as results aren’t immediate.

There should be a balance between the two activities – planning and dreaming fuels your mental state and gets you in the mood for the actual work, and working gets things done and gives you real gratification.

When you’re planning to work, make sure you set time aside to actually do what you need to do. When you’re overwhelmed and don’t feel like working, plan or daydream about the achievement.

PyCaribbean Retrospective

I went to PyCaribbean this past weekend. It was a marvelous experience.

I’m told not all conventions are like this – so I’m inclined to share my perspective on this one and congratulate the CodeTiger crew for an excellent event.

The talk lineup was great, the presenters knowledgeable and from all walks of Python development. The food was amazing, too, as well as the setting.

So, here are some things I’m happy about:

  • Leonardo Jimenez’s idea to kick off the event with an instrumental tropical music group was amazing. The song selection was pretty good, and the performance really tasteful. I’ll forever think that Happy should include an interlude in Bachata style.
  • Brandon Rhodes’s keynote was beautiful. I loved the way he seemed to ramble for a bit, just to tie the thread back into the whole opus. Yes, Python is born of thoughtful syncretism, just like natural languages, just like cultures.
  • The talks were plenty – four tracks, three in English, one in Spanish. We were tight for time once or twice, but the chatting on the breaks was blissful. No posturing, no bragging – just people talking about their awesome work, discussing the insights shared in the talks.

Alas, I couldn’t go to all the talks. In particular, I’m eager to see the talks uploaded so I can drink in Andrew Kuchling’s “Computer Recreations or, Rediscovering the 80s for Programming Fun” and Georgia Reh’s “How to Teach Git”.

There were some talks which resonated with me and follow a thread:

  • Allen Downey’s application of Python features to semantically map a domain into the language was very spot on. Reminded me of Lisp DSLs, which was the whole point of it. Exploring a domain through mapping of concepts to operators is a great way to create a framework for expressive exploration of our mental model. Going from thought to executable model? Bravo!
  • Geoff Gerriets’ insight about the similarity between coding and writing is flawless: we become better through revision, and our code becomes better as we revise it. Programming, just like writing, is not merely typing away – but also chipping away at what we already typed, reforming concepts and pacing and merging or separating concerns. I find this plays well with Allen’s talk.
  • Thomas Ballinger’s “REPL-Driven Development” was all I was hoping for. And believe me, after seeing what’s possible with Lisp, I was eager to see what we can do in Python. This is a perfect conclusion to the “I’m coding smartly, in a way similar to how I would express myself, and I want an intuitive grasp of the language I’m using to express myself in.”

They’ve inspired me in so many ways – the analogies, the topics, the fluidity and segues… I’ll use that to fuel and improve (revise!) the talks I’ll share in the communities.

As a friend of mine would say, “10/10 would play again.”

Interesting Things I can’t pay attention to right now

A side effect of trying to focus a lot, especially when you have a habit of exposing yourself to new information while on break, is that you get many serendipitous moments you can’t chase.

This is painful, so I tend to keep the tabs open in the browser for a while, maybe even advance in the reading, but quit eventually. It’s not really easy to keep up on a subject on a nibbling basis, when you don’t have time to digest the content because you’re focused on the now.

So, here’s a list – which is short right now, but a living document, and I hope you’ll find something interesting to read about (and then let me know your conclusions, hopefully).

Interesting Things I can’t pay attention to right now.

Some notes on Spring

Spring used to be confusing for me, largely due to the form of the comprehensive online documentation. The main issue with the docs is that the aim of the framework is not explained succinctly, and without knowing the intent it’s very hard to understand what’s going on with basically any piece of code.

Spring in Action gave me a powerful view of many of the frameworks in the collection of Spring frameworks, and drove the message home:

Spring was built to make Java Enterprise development easier.

With that in mind, I started seeing the pattern everywhere; it’s a very thoughtful collection of tools which I could grasp because, basically, I’d thought about implementing many of the different features for my projects at one point or another… even if joyously differently from what I’d had in mind.

Some of the key features in Spring:

  • Dependency Injection is handled by a container – which is not a standard JEE container. This one had me confused for a while, as I groped for a way to get ahold of the application container. Developing your first Spring app for hosting in WebLogic (or any other application container) is bound to be confusing. Try the self-hosted approach first.
  • Aspects! And the implementation of many features with them. Aspects are mechanisms which allow you to write the code somewhere and have it splash everywhere – the sort of thing that would help you reduce the code for security, session management or logging. Some Spring features are implemented this way.
  • Templates – another way to reduce boilerplate. Just hand over your tired, boilerplate-ridden code and watch it be reduced to an expression of what you mean to do. These aren’t UI templates, or code templates like you’d use to stub a project. They’re behavior templates, which absorb the kind of boilerplate we write all the time when working with JDBC.
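As a rough analogy for the “write it once, have it apply in many places” idea behind aspects, here is a Python decorator that adds logging around functions without touching their bodies. This is not Spring’s API and not how Spring implements aspects; `logged`, `call_log`, `transfer` and `close_account` are all hypothetical names for illustration.

```python
import functools

call_log = []  # stands in for a real logging backend

def logged(function):
    """Wrap `function` so that every call is recorded before it runs."""
    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        call_log.append(f"calling {function.__name__}")
        return function(*args, **kwargs)
    return wrapper

@logged
def transfer(amount):
    return f"transferred {amount}"

@logged
def close_account():
    return "closed"

transfer(10)
close_account()
print(call_log)  # ['calling transfer', 'calling close_account']
```

The logging concern lives in one place, and the business functions stay free of it. The analogy is loose: here each function still has to be decorated explicitly.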

One piece of Spring I found a massive hindrance to thorough learning is the pervasiveness of Spring Boot usage in the documentation. Having so many things be automatic is great for those who know what they’re about, but not so good for learning. I found Spring Boot akin to an IDE, and as now all the documentation refers to it… well, it’s not easy to piece everything together from the ground up anymore. All in all, a good tool – maybe I’ll write some small docs to help get started with it.

Knowing Yourself (noticing mental illness)

Recently I’ve learned a great deal about mental health. One of the facts that caught my attention is that people don’t usually seek help, and often can’t tell that they’re going through some suboptimal situation they could work on.

Stigma (fear of finding out you’re mentally ill), cycles in the symptoms, and the difficulty of knowing that your mind is behaving oddly (because you think with your mind) may all play a role in this. When I encounter a difficulty, I like to tool it away. Thus, a log of how we’re feeling may help detect when we’re feeling differently or showing a different pattern. Because we’re using an external tool over a long period of time, it is easier to see if the patterns change: we set aside our momentary subjective point of view and analyze our long-term subjective time-lapses instead.

This goes in line with the initiative of the guys at Quantified Mind, and the Quantified Self movement. Besides keeping a diary (narrating what happened during the day) and a log (a timestamped list of things you do) to be able to find any anomalies, it’s useful to rate yourself from 1 to 10 in Energy and Mood, and taking some of the tests from Quantified Mind should help too. Setting up experiments with them can help you see if anything you change in your habits has a significant impact. Periodically looking through your tendencies, log and diary to notice correlations between your feelings and other variables can help you detect cycles.

If you feel like things are going wrong even though things may look to be alright, or if you notice sharp cycles in your feelings, it could be important to seek help. Please, do not be blindsided by an illness which can be controlled, hopefully kept out of your normal life.

Bringing People Down (is usually not nice)

But sometimes necessary.

Actions have consequences, as far as we can tell. We can, then, evaluate someone’s impact on a certain environment by tracing the causality from effect to cause.

In any case, it usually happens that environments which involve people are really complex, so much so that I find them scary. Our own actions are likely to have unintended consequences, so we can hardly be certain of the consequences of other people’s actions. Thus, our responses should be commensurate with our certainty, and our certainty calibrated as properly as possible.

Sometimes we feel sure that we need to stop someone from doing something, because we observe and feel sure that their actions are particularly pernicious. Sadly, because of the complexity of most situations, I always feel that maybe there is another way, even when I can’t find it. Life is a sequence of compromises, tradeoffs and decisions, though, so some things must be done even if they make us feel down.

Some people take actions with consequences so pernicious that we may feel validated in stopping them. Somehow, though, it never feels nice.

Prisoner’s Dilemma as a proxy for trust

The Prisoner’s Dilemma is a simple game which illustrates a very interesting type of situation we can find ourselves in any day. It goes like this:

Two members of a gang are imprisoned. Each one is in isolation from the other, with no means of communication. Each is offered a deal:

  • If only one of them confesses, they go free while the other does three years
  • If both confess, they do two years each
  • If neither confesses, they do a year each

The main point is that if you confess, you may go free, but cost the group three years. If both confess, neither goes free, and it costs the group four years. If neither talks, the group loses only two years, although it is certain that both will go to jail.
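The arithmetic of the deal can be laid out as a small table. This Python sketch uses exactly the sentences listed above; the function name `sentences` and the labels A and B for the two prisoners are my own.

```python
def sentences(a_confesses, b_confesses):
    """Return (years for A, years for B) under the deal described above."""
    if a_confesses and b_confesses:
        return 2, 2
    if a_confesses:
        return 0, 3
    if b_confesses:
        return 3, 0
    return 1, 1

print("A confesses | B confesses | A years | B years | group total")
for a in (True, False):
    for b in (True, False):
        years_a, years_b = sentences(a, b)
        print(f"{a!s:11} | {b!s:11} | {years_a:7} | {years_b:7} | {years_a + years_b}")
```

Reading the table from A’s side shows the trap: whatever B does, A serves less time by confessing (2 instead of 3, or 0 instead of 1), even though mutual silence gives the lowest group total.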

Daily life has many such situations in which someone can win personally while making the group lose; crooked politicians or police officers; salespeople; employers and employees. We live in a world of conflicting incentives.

Anecdotally (per a BuzzFeed video), people are less likely, when playing the game, to ‘fess up against someone they know, but more likely to do it when set up with someone they don’t know. This makes sense, as someone you know has more weight for you emotionally than someone you don’t.

I have found that a good proxy for how much I could trust someone in a certain situation is to think about how they’d act if we were playing the game. Luckily, often in life we have the chance to communicate with the other people involved in our dilemmas; I’ve not yet had the presence of mind to try and work against a forecasted negative in a prisoner’s dilemma-like situation, but it may be worth a try.

In any case, this simple exercise and the observations on people’s behavior regarding strangers are among the factors that make me think that maybe we shouldn’t trust strangers with choosing for us the way it often happens in representative democracies… because even if we may feel invested in them from watching them on TV or reading about them in the newspapers, odds are they don’t know us, thus can’t empathize, thus don’t have a strong incentive to fight the other strong incentives that come from being in a position of power – namely, to benefit themselves and the people they do feel strongly about.