PSA: Secure your build processes

I need to say this because there’s too much moaning and grinding of teeth going on about npm packages loads of projects depend on.

If you have a project with dependencies, do yourself a favor and have an in-house mirror for those. It’s even more important if you’re a software shop which works primarily with one technology, which I presume is a very common case.

I’m not too node.js savvy, but in the Java and Maven repositories world, we cover our backs using either Nexus Repository (which appears to work with npm, too) or Apache Archiva.
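
For Maven builds, pointing everything at the in-house mirror is just configuration in settings.xml. A minimal sketch, assuming a Nexus-style group repository; the id, name and URL below are placeholders for whatever your own instance exposes:

<settings>
  <mirrors>
    <mirror>
      <id>in-house-mirror</id>
      <name>In-house repository mirror</name>
      <mirrorOf>*</mirrorOf>
      <url>https://repo.example.internal/repository/maven-public/</url>
    </mirror>
  </mirrors>
</settings>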

That way, when we “clean install” the last checked-in code for final delivery to the QA and deployment teams, we don’t run into crazy issues like having it not build because someone decided to take their code down – or had it taken down by force.

In a Netflix Chaos Monkey-like approach, try to foresee and forestall all causes of unreliability at go time, not only with this but with any other kind of externalized source of services. You, your family, significant others, pillow, boss, co-workers and customers will all be happier for it.

Use LetsEncrypt

I’ve successfully renewed the SSL certificate for this website, and this first repetition of the maneuver prompted me to automate it.

It’s a really nice way to keep your site secure, and it pushes you towards automating the renewals by having a relatively short certificate life span. Plus, it’s free.

Receiving the expiry alert was very refreshing; I still had 19 days to renew my certificate, which would’ve given me plenty of time to do it even if I’d not had a shell script handy, waiting only for a good chance to be tested and scheduled.

Keeping your client-server communications secure is a must-do; even if most of what I write here will eventually see the light of day, much damage could be done to my image if the site were compromised; and this is just a personal website. If you have a website where clients log in and trust you with their data, and for some reason you do not have secure connections enabled, do yourself a favor and fix that problem.

Thoughts About History

History is a subject that usually leaves me dissatisfied. It may be that I have the wrong approach in the way I think about it, but it has consistently left me feeling uncertain over the years.

We learn history from many sources: oral stories told by our families, which usually cover anecdotes and interesting tidbits; written texts by the historians out there; the news of the day and of other times; and textbooks.

Now, oral stories are notoriously unreliable. I know, because I’ve seen the deformation of anecdotes firsthand during my lifetime – which is still on the short end of the scale. Stories about grandparents and further back in time… I can only expect they retain no more than a passing resemblance to what was going on.

The books by historians are in some ways similar to the news: they go through a publisher’s hands, they are subject to all kinds of pressures and interests. The winners write history.

Textbooks, at least in my country, are increasingly regulated. In public education they are literally handpicked. This kind of history has the strongest, most viable path to being censored/edited by an interested party, because there’s a single bottleneck in an office in a government building.

How can we ever be certain of what happened? I keep uncovering facts that contradict my earlier knowledge, in ways so blatant that I can see it’s not a model I’ve built: it’s a model I’ve been handed, one that has been socially validated and that may or may not have anything to do with reality, or with what someone wants me to think about myself and my environment.

Many aspects of history are subtly manipulated in ways I’ve learned to identify over time, and which make me react intensely.

Attributing intent to people is one of those; defending actions in hindsight is another. It makes me want to see proof – that a certain intent was there, that a certain datum was there, that people demonstrated thinking with a certain pattern or using certain tools. But when I think about what kind of proof that would require, it is then that I feel helpless. See, because of my (hopefully healthy) dose of skepticism, I understand that I shouldn’t treat much of history as more than fables and fiction.

On the other hand, the effects of history are real. The effects of perceived history are just as real, although maybe not as intense. I think, then, that there is use for understanding what the world thinks of its own history, because that allows us to have a working model, a framework from which to work and communicate.

But we should be careful about the way we extrapolate, the way we apply the model to our current situations. Historical data we use as input for the way we think must be tested and considered “possibly wrong”, and the truthfulness we assign to it is part of the model we’re working with.

So, am I skeptical and mighty, with an unbreakable vow not to trust history? Not quite. I’m gullible with historical information – we all are, as humans are attuned to stories in their patterns of thinking and remembering. But I do take care when I have the opportunity to make a decision based on the past. Even for events in which I’ve been involved, I try to get other versions, other sides of the story, to have a better chance of understanding what was truly going on. On more than one occasion I’ve been surprised.

I acknowledge that this position is not very elegant; it imposes a huge burden upon the people looking back and looking forward, trying to make good choices. It seems to question the validity of basically everything we think we know about the past, although it actually only questions the accuracy of most of what we assume we know from the data we see… yes, not much better.

Nonetheless, I’d like to be proven wrong time and again. The way that would work is by having a decision taken on account of a model based on an understanding of what happened some time ago, the further back the better because accuracy decays over time, and having the decision work for reasons consistent with the model. I’ve seen this in many decisions taken from personal experience in management, software development and teaching, which tells me that many people really understand what they’re about in their daily work. It may be hard to set up a large-scale experiment, but in the absence of data to validate our beliefs, we should acknowledge that lack instead of just defaulting to the most comfortable side.

Notes on Java 8

Enum types are constructs which represent sets of known values. They are useful, but are something of a kludge in a way that reminds me of the String class shenanigans.

Enum types are declared like other composite types, such as classes or interfaces. They have their own keyword. Unsurprisingly, it is enum.

Now the use for them is clear: they avoid the need to have a load of “public static final” fields lying around, and they get common functionality, like a static “T[] values()” which returns all the possible values, or a static “T valueOf(String)” which returns the value whose name matches the String parameter.
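
For instance, a minimal sketch (Direction and its values are made up for illustration):

enum Direction { NORTH, SOUTH, EAST, WEST }

// Elsewhere in the code:
Direction[] all = Direction.values();      // NORTH, SOUTH, EAST, WEST, in declaration order
Direction d = Direction.valueOf("NORTH");  // the NORTH constant
// Direction.valueOf("NROTH") compiles, but throws IllegalArgumentException at runtime.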

They can be used to build finite state machines, and can be used with the “switch” construct. They help avoid silly-in-hindsight but maybe-really-serious bugs by catching typos: all values must be declared, whereas if the match were made against a String literal, a typo would silently create a branch of never-used code.
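
A sketch of the switch usage, given a Direction variable d from the hypothetical enum above (handleNorth and handleSouth are placeholder methods):

switch (d) {
    case NORTH:
        handleNorth();
        break;
    case SOUTH:
        handleSouth();
        break;
    // case NROTH: would be rejected by the compiler, unlike a mistyped String literal,
    // which would just create a branch that never matches.
    default:
        break;
}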

Another neat feat is that they can return their name as a String, and can be compared to each other – the order in which they’re declared determines which value is “first” and which is “last”.
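
Continuing with the hypothetical Direction enum:

String label = Direction.NORTH.name();                             // "NORTH"
int position = Direction.NORTH.ordinal();                          // 0, since it is declared first
boolean earlier = Direction.NORTH.compareTo(Direction.WEST) < 0;   // true: declaration order decides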

Now, enums are actually classes with shenanigans added. Even though it’s never spelled out, an enum type is a class which extends java.lang.Enum<E> and has some extra methods (which I suspect are injected at compile time; it would be nice to confirm by looking inside a .class file. For the record, clever use of reflection would make it rather trivial to create a generic method which is invoked for the special, shared functionality).

As it’s a class, you can write other methods and declare variables in it.
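
A sketch of that; the Planet enum and its numbers are purely illustrative:

enum Planet {
    MERCURY(3.30e23),
    EARTH(5.97e24);

    private final double mass; // in kilograms

    Planet(double mass) { // enum constructors are implicitly private
        this.mass = mass;
    }

    public double getMass() {
        return mass;
    }
}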

But you can’t build an enum by hand – extending java.lang.Enum<E> directly is illegal, even though the class is not final. Which is an annoying inconsistency and lack of elegance. Was it necessary to implement it like this? Very likely, as there are many really smart people working on the Java language. It’s not pleasant, though.

Enums are treated like special citizens, and even have particular data structures and algorithms tuned to them (EnumSet and EnumMap); which further reminds me of the shenanigans that go on with the String class, what with special in-memory representation and all.
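
A quick illustration, again with the hypothetical Direction enum (EnumSet and EnumMap live in java.util):

EnumSet<Direction> horizontal = EnumSet.of(Direction.EAST, Direction.WEST);

EnumMap<Direction, String> labels = new EnumMap<>(Direction.class);
labels.put(Direction.NORTH, "up");
labels.put(Direction.SOUTH, "down");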

These are not bad things in and of themselves – the way that enums and Strings break out of the pattern of the language. They don’t fit the mental model that would arise from studying the rest of the architecture, though, so care needs to be taken that they don’t come back to bite you.

Low Energy

Perhaps like Bilbo missing his hat – not for the last time – I wonder now whether it was a good idea to commit to a full year of daily writing.

Although most of the content comes from personal experience, meaning I mostly consult sources to double check as opposed to for original research, it’s a heavy drain on the time and energy I have available.

This poses a serious threat to my desire of sustained output (in writing) and increased output (in code), and I need to deal with it.

Playing around with my sleep schedule may help, as could also playing around with my eating habits. The former is a bit hard to achieve due to a fixed daily work schedule and traffic.

In any case, I will run some experiments – I’ll try having a bigger breakfast, for instance, and see how it goes.

Today’s text was supposed to deal with Java enums, but my energy was too low to do it. I increased my at-home work-to-leisure ratio, which accounts for this drastic drop, although I’ve been tolerating a mild beating in the last few days. Tomorrow will be a new day, I suppose, and I hope to cover the intended topic.

Thanks for reading.

Argument for learning to develop with emacs

You may have read my thoughts on why people should learn to code with a good CLI environment at hand.

I’ll one-up myself – it’s good to have emacs at hand.

You have all the features of the shell – well, you have the shell itself, if you want to. But you have even more extensibility, because you have other aspects to work with than apps: you have text editing as a programmable activity.

So besides having and being able to easily create tools to work on data at one level (files), you have and can easily create tools to work on data at another level: content.

So, emacs is not necessarily the only environment which one-ups a good CLI, but it’s the one I know and I can get behind.

What’s so special about having the content editing be programmable is that it’s a layer of power beyond tweaking your environment and adding tools to your toolbox. This is like tweaking the toolbox itself, so that some of your tools work better.

The effect is so strong that people who work with emacs often find themselves working more and more from within emacs. So you don’t have an environment from which you call the editor to do some work. You have an editor from which you work, and as code is text, you have a powerful tool in which to build powerful tools to better build powerful tools.

This is highly desirable.

An argument for learning to code with CLI environments

There’s a difference between working in the command line and working in a desktop environment which I think is crucial to a software developer’s training.

When you’re working on a desktop environment, there’s a particular flow that shapes the way you interact with your programs and files: you’re working on a physical space, and to use your tools, you go to them, then get your data into them, use them, then get your data out, back into your desktop. For instance, if you want to take some data from a text document and perform numerical analysis on it, you’d go about it thus:

  • Go to the place where you launch programs from, and open an editor
    • browse for the document with the data
    • get the data out, presumably by copying it into your system’s analog of a clipboard
  • Go to the place where you launch programs from, and open a spreadsheet program
    • paste the data
    • use the spreadsheet on it
    • store the resulting document where you want it

The important part here is that you go to the tool then get the data in the tool.

In the command line, it would be a little different.

You would point the relevant tool at the document, extract whatever data you need, and point another tool at the resulting documents if you still need it. It’s the difference between:

  • Navigate “Start menu -> programs -> accessories -> Notepad”
  • Navigate the menu “File -> Open”
  • browse for the document

And

  • Type “editor name_of_the_document” or “editor /path/to/the/document”, or in the worst case scenario, “/path/to/editor --option-or-options /path/to/document”

There are some workarounds to this situation, like having default applications able to open a file, or right-clicking and selecting “Open With…”; but those actions feel very ad hoc.

Thus the command line feels more like you’re reaching for a tool and using it where you need it.

A consequence of this is that when you write an application for use in the command line environment, no matter how simple it is, it feels as if you’re changing your operating system to suit your needs. When you develop simple desktop applications, it feels as if you’re building environments to put your data into.

From this peculiarity of each environment stem patterns of use. Tool chaining and piping feel like “reach for this, apply on file/data, then reach for that and apply it on the results”, which helps you think about the whole process seamlessly. Building tools to put into a chain means that they could do only very simple things, and still be very useful. Thus are early developers encouraged.

Customizing the whole system… is very encouraging, of course, and having “commands” or “actions” at your disposal which you’ve built yourself, as opposed to having “containers” into which to put the data for work, is very empowering.

There may be some way of chaining programs in a desktop environment, or creating simple software in such a way that it feels like you’re changing the whole system. I have not encountered either.

The most powerful feelings I’ve gotten around desktop environments stem from manipulating the PATH variable and having some daemons doing fun stuff around the UI, the kind of stuff that would give you pause and make you wonder whether your computer’s behaving oddly or you’re imagining things. Doing this in a command line environment is par for the course, but feels no less empowering.

I am not talking from nostalgia here – I grew up using GUIs, not CLIs. Indeed, even though my first programs were CLI-based, as I was using them from a GUI, the feeling of changing my environment to suit my needs didn’t come along until I’d started deploying software in a Solaris environment and started automating things.

When I discovered the feeling of changing the PS1 variable, customizing the .bashrc or .profile, creating aliases… it was then that I started being prolific in my CLI-program writing for practical purposes. It seeped into my other operating systems / work environments; I routinely create a directory for shortcuts to the programs I run, I tend to run programs with the “Run” option of the OS I’m using (Win+R in Windows, Win + program name in Ubuntu, Ctrl+Alt+T and run the program from the shell in most other Linuxes, Command+Spacebar + program name in macOS), and I write UI gadgets or other empowering shenanigans.

I think that much of this would’ve started earlier if I’d had a powerful command line as a primary tool for using the computer at the time I started learning how to code. I don’t mean to disparage GUIs; I just think the kind of feeling delivered by a good CLI and the patterns of use they encourage can be of great importance when starting to code, enough to make it at least a great complement to GUI-based computer usage and software development practices at the time of learning.

Brief comment on AlphaGo’s victories

AlphaGo has won 8 straight games against two of the top players of one of the most computationally complex perfect-information games out there: Go.

Its complexity stems from the sheer amount of possible scenarios that can play out in the game.

I thought that AlphaGo wasn’t really a breakthrough from usual playing techniques, although I failed to say so in public – maybe I should cultivate the habit of predicting things publicly.

In any case, these are the advantages I think AlphaGo has over other Go AIs:

  • More computing resources (memory, processing power)
  • Access to better players

The first one is obvious. The second one, maybe not.

So here’s my take on the workings of AlphaGo: It has a combination of the Monte Carlo Tree Search algorithm and a Neural Network or other kind of pattern matching mechanism.

The pattern-matching mechanism, particularly if it is a Neural Network, would benefit from playing a lot of games; it could learn to prefer analyzing a particular “branch” or sequence of movements – we could say, roughly, pursue a train of thought – in a way that makes it likelier for it to win.

If a pattern matching algorithm plays only poor players, it will learn to beat them, but it won’t know how to beat good players. If it plays only a certain kind of game – say, the opponent always plays in a similar pattern – then it has gaps in its game, complete situations for which it can’t have a good heuristic.

Playing good players means that the tool explores many of the best techniques and possibilities frequented by good players, thus becoming better at choosing how to play against them.

Now, you may think “well, it won’t be able to beat poor players, then, right?” But you’d be wrong. Because playing actually implies thinking about many turns in the future, and patterns for winning are already there, and there are techniques to secure your position… well, a poor player won’t be able to foresee much, which a good AI would, and even if the AI couldn’t bias itself towards the smartest plays, it can choose decent ones, which would be enough to beat a poor player.

In essence, I think AlphaGo has two mechanisms, then: one which has a bias towards immediate good-looking plays, and another one which has a bias towards statistically good-looking plays over many games.

I didn’t think all of that before the fourth game of AlphaGo vs Lee Sedol; I just had a hunch that you wouldn’t actually need a revolutionary AI to beat people at go. I may still be proven wrong if and when a paper about AlphaGo describes its inner workings.

What makes me feel more certain about it is this:

Mistake was on move 79, but only came to that realisation on around move 87

Demis Hassabis, CEO of DeepMind

And he had access to the data.

After move 87 or so, AlphaGo went haywire.

While about to post this, I found that the tweets by Demis Hassabis confirm my suspicions.

The neural nets were trained through self-play so there will be gaps in their knowledge, which is why we are here: to test AlphaGo the limit

So, while it’s amazing to see that a computer may outperform a person in a well-specified, perfect information game… I think it is at an advantage! Because Lee Sedol’s mind wasn’t trained on playing computers, but on people. I think that by exposing himself to a dozen or so further games with AlphaGo, Lee Sedol could start routinely beating it.

This reminds me of the image recognition neural nets which mistake static-like images for all kinds of animals. You can find a set of images on which humans will routinely outperform them. As Go is a game where the “picture” is the result of your and your opponent’s plays, you can routinely set up such images.

Let’s see what happens in today’s games, anyway. Fun times to be alive in.

There is some good in being ever curious

Or: a healthy dose of skepticism should be accompanied by a healthy dose of curiosity. And, in my experience, people are seldom curious enough, so raising the bar is hardly a danger.

I don’t seem to have mentioned the 12 virtues of rationality before in this blog, but I have talked about rationality under another, hopefully less loaded, name: cognitive calibration.

This is an invitation to be curious about the people around you and the decisions they make, especially if they impact you. Why does a teammate want to go in a certain direction? Why does a colleague think our choice is poor (or stupid, depending on their niceness)? Why, what, who, when, where, how?

Cognitive calibration is all about having a better model of reality – the better calibrated our cognition is, by definition, the better our capacity to predict the future and understand the present.

One of the tools that can make a model give more accurate predictions is the abundance of data. Thus, always wonder, always question, always try to find out. If your model is wrong, or the way you’re getting data is biased, then your understanding of the present and future will be ever further from the truth.

In order to correct for this: wonder. Wonder whether you’re truly in the right, whether others are truly in the wrong. Look for the answers.

If reality doesn’t fit your predictions or understanding: wonder. Wonder about how stuff really works, about what’s really going on, about where you went wrong.

If you’ve not changed your mind or don’t like questioning a particular subject: wonder. Wonder about when and why you became so protective, about how to overcome that protectiveness in order to better understand yourself and be able to wonder, once again, whether you are right – or how you can be right.

Curiosity seeks to annihilate itself, writes Eliezer Yudkowsky.

That’s true – but it’s like a phoenix in that it can be reborn time and again. For as long as I’ve been looking for answers, I’ve always found more questions. I think we’re far from the place where the number of questions will start converging and shrinking, the place where we know it all.

So please your curiosity and crash it into the facts, that you may be wiser, and curiouser.

Try not to get into trouble, but always bear in mind… how much trouble are you in by not knowing now, not being able to see, even in a blurry way, tomorrow?

Notes on Java 7

Remember when we talked about the two kinds of relationships between concepts (aka blueprints, aka classes) in Java? One interface-based and the other class-based? (Please bear in mind that interfaces are blueprints-of-blueprints, which means they’re blueprints too, just a special kind.)

In any case, interface-based relationships are peculiar. You may remember the notes on Java’s ergonomics and its relationship with existing interfaces. If you don’t, skim Notes on Java 6 to get a sense of where I’m going.

You may have noticed that some interfaces have a <T> up there. Let’s talk about that.

An interface-based relationship is also called a contract-based relationship, because the interface basically states things that are guaranteed to be doable.

The <T> is another type of contract, this time guaranteeing not operations, but types. The “T” represents a type. Usage of this is called generic programming.
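
For example, implementing one of those parameterized interfaces looks like this. Book is a made-up class; Comparable<T> is the standard interface for natural ordering:

class Book implements Comparable<Book> {
    private final String title;

    Book(String title) {
        this.title = title;
    }

    @Override
    public int compareTo(Book other) {
        // The contract guarantees the argument is a Book, never anything else.
        return this.title.compareTo(other.title);
    }
}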

When you declare a class to include the generic type <T>, you can then refer to it inside, for instance to receive parameters of that same type T. The T gets substituted by whatever type the programmer supplies, lending type safety.

If you have a class

class A<T> {
    // T stands for whatever type the user of A supplies at the use site.
    private T parameter;

    public void setParameter(T param) {
        this.parameter = param;
    }

    public T getParameter() {
        return this.parameter;
    }
}

and use it like this:

A<String> a = new A<>(); // before Java 7 you had to type <String> twice.
a.setParameter("someString");

all will be well, but if you do this:

A<String> a = new A<>();
a.setParameter(25); // error: incompatible types: int cannot be converted to String

it will fail at compile time.

Which is good. The possibilities are many, but this ensures that a particular composite type is coherent.

A special case for generics is when you want to make sure you are working with a type that has a particular relationship to another, known type, be it above (general) in the hierarchy or below (specialized).

For these cases, the special wildcard argument “?” is used. Wildcards go on use sites (parameter, variable or field types) rather than on the class declaration itself, like this:

public void readFrom(A<? extends Type> source) {}

or

public void writeTo(A<? super Type> target) {}

as the case merits. (Here Type stands in for the known type and the method names are just placeholders. If you want to constrain a declaration itself, you bound its type parameter instead, as in class A<T extends Type>; only the extends form is allowed there.)