Tips to Cement your Knowledge

There’s a particular joy in learning something (a programming language, how to use a tool, an algorithm), then perfecting it through practice and teaching.

Today I’ll talk about two ways to do both while building your programmer cred and helping the community. You’ll very likely hone additional skills along the way.


Make sure you learn your tool, task, or algorithm properly. Understand the tradeoffs and caveats. Get it to break a couple of times in different ways, so you get a feel for failure modes. Once you find yourself feeling confident enough, carry on.

Method #1: Hack on random people’s projects

GitHub is a great place to be. If you find another place like it, please let me know. Its main feature here is discoverability. Some months ago I was preparing a talk about a particular module of Python’s standard library, argparse.

My main use case for Python is writing tools for myself, so I use argparse often. But to make sure I didn’t have the wrong end of the stick, I went to GitHub and looked for Python projects which might need argparse. How? I searched for “argparse” and filtered by issues.

Like this:

[Image: GitHub search for “argparse”, filtered by issues]

You can further filter for the open ones, which is what you’d want for the exercise: get into a project, read up on it, clone it, add argparse support, and open a pull request. Rinse and repeat a couple of times.
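
As a rough sketch of what “adding argparse support” can look like, assume a hypothetical word-counting tool that used to read its options straight from sys.argv. The program name and the options (`path`, `--count`, `--verbose`) are made up for illustration and don’t come from any real project:

```python
import argparse


def build_parser():
    """Build the command-line parser for a hypothetical word-counting tool."""
    parser = argparse.ArgumentParser(
        prog="wordtool",  # hypothetical program name
        description="Count words in a file (illustrative example).",
    )
    parser.add_argument("path", help="file to read")
    parser.add_argument("-n", "--count", type=int, default=10,
                        help="how many words to report (default: 10)")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print extra progress information")
    return parser


def parse_options(argv=None):
    # Passing argv explicitly makes the parser easy to test;
    # with argv=None, argparse falls back to sys.argv[1:].
    return build_parser().parse_args(argv)
```

Calling `parse_options(["notes.txt", "-n", "3", "-v"])` gives an object with `path`, `count` and `verbose` attributes, and argparse generates the `--help` text for free, which is often what the project’s issue was asking for in the first place.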

Then follow through: you’ll probably be asked to make some changes, maybe document the usage (if you changed the way the tool is used), maybe write some tests. Ideally, you’ll get the hang of it and start looking for the right things when contributing to more projects, so that the one feature you learned to implement needs fewer review rounds before it gets into the projects which need it.

This helps you grow as a software developer and as a communicator, and thus as a contributor to open source projects, besides honing your ability to implement whatever you chose to learn.

Method #2: Get involved personally

Find a local user group, and prepare a talk on whatever it is you learned. Teach it to people personally. If not at the local user group, then maybe at school, or at work. Pass the knowledge on, and make sure to note any and all questions that come up, especially if you don’t know the answers right away.

This will help you cement your knowledge, and hopefully earn you a reputation for knowing, teaching, and being cool in general. If there are no local user groups, or for some reason you can’t get involved as much as you want to, consider writing on the subject.

Hopefully someone will stumble upon it when they need it, and that’s more than you bargained for when just learning. Additionally, this kind of public writing gives you some cred.

Just please, please make sure you learn what you want to teach. Don’t copy and paste, and don’t pass along partial knowledge.

A bonus tip: if there are knowledgeable people you can reach (personally, through e-mail, through IRC), then reach out and make sure you have your facts straight. Read the docs. Write code. Only then pass it on.

Development Environment 9

This is the last installment of the series exploring the components of a typical development environment. You may want to read Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7, and Part 8 before reading this one.

This is a special post, not only because it ends the series, but because it covers a tool which integrates every single component in the toolchain to try and make our work easier when developing.

The Integrated Development Environment

These are also known as IDEs, and will be referred to as such for the rest of this article.

The primary task of an IDE is to make every piece of software at our disposal work together to make software development easier. IDEs usually provide uniform front ends to all of our tools; so you may see that your IDE has a built-in text editor, a command to compile the code, a debugger and profiler front end, and static code analysis feedback built right into the text editor for ease of action. Furthermore, because the IDE knows about the programming language and tools you’re using, its file and directory views are designed to show you the important things and hide the unimportant ones.

IDEs can get a bad rap because, by automating so much and providing front ends we can’t easily see through, they can hamper the learning process of the toolchain we depend on to work, creating habits that render us helpless if we stumble upon a peculiar situation that the IDE makers didn’t cover.

Another not-so-fun fact about IDEs is that, because they’re complex and refined pieces of software, a beginner or intermediate developer may feel that improving the tool is out of reach. Indeed, it takes a very particular kind of developer to realize early on that changing much of what goes on in an IDE, however complex it may be, is not even close to wizardry. That’s thanks, in part, to the developers of the environment themselves, who architect means for customization in the form of plugin, macro, module or other extension support.

Here’s a list of gratis IDEs. I’ll also link to starting points to customize said IDEs.

Arguably emacs and vi can become IDEs, but to do that you need to set up some plugins. If you’ve the mindset to work on those, you are already extending them, although with third party code. Keep it up.

This wraps it up for the week and the series. I hope you’ve enjoyed this light walk among the tools that help us code. If you’ve enjoyed this, please consider subscribing.

Have a nice weekend.

Development Environment 8

This is an installment in a series covering the different components of a typical development environment. You may want to read Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, and Part 7 before reading this one.

The Automated Build Tool

This is one of my favorite topics in software development, because it lends insight into the potentially exponential self-actualization opportunities of this occupation; it is similar to compilers in that respect.

I was going to digress a lot (stopped at just over 300 words into the subject) into how the automated build tools are a marvelous example of coding tools to make coding easier, but I think I will address that in another post, because the deadline looms ever closer.

So, the role of the Automated Build Tool is to run all the steps we need to go from code to binary, and it does so by using the tools in our toolchain. In order to do so it needs to know how to invoke every one of those tools, and this is commonly achieved by typing the commands we would be running time and again to achieve our tasks.

In a single execution, our Automated Build Tool may run the static code analyzer and record the stats (letting us know if something is out of the parameters we’ve set as acceptable), then run the compiler on the code (which is sometimes a very convoluted process, and almost always needs to have a particular order respected), then maybe run some of the compiled binaries on other binaries (automated tests, anyone?) with perhaps some profiling happening at some point. Running any of those steps may involve invoking an interpreter.
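
A very small sketch of that idea, under the assumption that each step is just a command to run and that a failing step should stop everything after it. The step names and commands are placeholders: each one only runs a tiny Python one-liner, so the example works without a real analyzer or compiler installed:

```python
import subprocess
import sys


def run_steps(steps):
    """Run each (name, command) step in order; stop at the first failure.

    Returns the list of step names that completed successfully.
    """
    completed = []
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"step '{name}' failed with exit code {result.returncode}")
            break
        completed.append(name)
    return completed


# Placeholder steps standing in for real tools.
STEPS = [
    ("analyze", [sys.executable, "-c", "print('static analysis ok')"]),
    ("compile", [sys.executable, "-c", "print('compiled')"]),
    ("test", [sys.executable, "-c", "import sys; sys.exit(1)"]),
    ("package", [sys.executable, "-c", "print('never reached')"]),
]
```

Here `run_steps(STEPS)` stops at the failing “test” step, so “package” never runs. Real build tools add a lot on top of this (dependency tracking, caching, parallelism), but the core loop is not far from it.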

This kind of process is something that is tedious to run by hand, and is done very often to test code being written or modifications being done to an existing codebase, because detecting an issue now costs less time than detecting it with a bigger time investment placed into code.

The automated build tool is at the forefront of everything related to Continuous Integration and Continuous Delivery. Because the tasks are often complex, the files containing the instructions can be enormous and confusing, and less than nice structures may arise to support behavior which is poorly covered by the Automated Build Tool itself.

Now follows the usual list of gratis tools of the category. These lists are by no means exhaustive:

  • ant
  • maven
  • gradle
  • grunt
  • bower
  • make
  • cmake
  • scons

Usually it’s a good exercise for a software developer to write a proof-of-concept build tool, perhaps one attuned to a particular programming language that can solve the dependencies among files in a project and build them in order. It is useful to notice that the tools that make software development possible are themselves software development projects, and that we can make tools to ease our life and make us many times more effective.
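
To give a feel for how small such a proof of concept can start, here is a sketch of the dependency-ordering part only. It leans on `graphlib.TopologicalSorter` from Python’s standard library (Python 3.9 or later) rather than a hand-written algorithm, and the project layout is invented for illustration:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_order(dependencies):
    """Return a build order in which every file comes after its dependencies.

    `dependencies` maps each target to the set of files it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical project: the file names are made up for illustration.
PROJECT = {
    "app": {"main.o", "util.o"},
    "main.o": {"main.c", "util.h"},
    "util.o": {"util.c", "util.h"},
}
```

`build_order(PROJECT)` lists the source files first, then the object files, and “app” last. A next step for the exercise would be to attach a command to each target and skip the targets whose inputs haven’t changed.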

That’s all for Today. Tomorrow, we will talk about IDEs.


Development Environment 7

This is an installment in a series covering the different components of a typical development environment. You may want to read Part 1, Part 2, Part 3, Part 4, Part 5, and perhaps especially Part 6 before reading this one, although it’s not strictly necessary.

Yesterday we covered the debugger, a dynamic code analysis tool which allows us to pinpoint certain unwanted behavior from our program in order to fix it. Today’s tool does something similar, but focuses on a specific kind of unwanted behavior.

The Profiler

Supposing a program’s output is correct, it doesn’t mean that the program is the best that it can be. Profilers analyze running software and help find the places at which most time is spent.

The time spent can be because of, principally, these reasons:

  • Waiting for Input or Output operations to happen (read from disk, write to network, etc.)
  • A certain piece of code gets called very often
  • Cache misses

The last one deserves special attention when our application shares the machine with other programs or runs in a constrained environment. It happens because, quite literally, the data is stored relatively far from the processor. In the worst-case scenario I’ve seen, the data has to be swapped in from the hard disk drive.

The fact that much time is spent in a place doesn’t mean that it can be optimized, only that optimizations at those spots have a bigger impact on the overall performance.
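
As a small illustration of finding where the time goes, here is a sketch using `cProfile` and `pstats` from Python’s standard library (neither is on the list below; I’m using them only because they ship with Python). The functions `slow_sum` and `work` are made up and deliberately wasteful:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive: builds a whole list only to add it up."""
    return sum([i * i for i in range(n)])


def work():
    # Calls slow_sum many times, so it should dominate the profile.
    return [slow_sum(2_000) for _ in range(200)]


def profile_report(func, limit=5):
    """Run func under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()
```

Printing `profile_report(work)` shows a table of call counts and times per function, with `slow_sum` near the top. The exact numbers depend on the machine, which is why none are shown here.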

The profiler is very similar to the debugger in terms of the information it needs to have about the analyzed binary and its goal. Thus, there are some programs which help do both profiling and debugging, like Valgrind.

A list of gratis profilers:

  • Valgrind
  • gprof
  • perf
  • JRockit Mission Control; works with JVM binaries
  • Firebug; works with JavaScript

That’s it for Today. Next, we’ll see the Automated Build Tools, which is likely to be the second-to-last in this series.

Development Environment 6

This is an installment in a series covering the different components of a typical development environment. You may want to read Part 1, Part 2, Part 3, Part 4, and Part 5 before reading this one, although it’s not strictly necessary.

Today we’ll be covering one of the tools used to analyze a running program. Unlike the Static Code Analyzer, which was covered in the previous installment, these tools help you see what is going on when the program is functioning, which aids comprehension of unwanted behaviors and rooting them out of our software. Without further ado, Today’s topic:

The Debugger

This piece of software is most used to correct a particular type of misbehavior: computations which produce the wrong result.

Just like when a child is learning to talk, or when one encounters and tries to use new words, phrases, and double entendres, sometimes we write some code which we think means something and it turns out it doesn’t. In software development, this is called a bug.

There are two particular types of bugs which are committed with more frequency at different stages of knowledge of a particular toolchain:

  1. Misunderstanding of the constructs we are using
  2. Misunderstanding of the process we are expressing

The first happens most often when learning a new tool, be it a programming language or some code written by another person, while the second happens when encountering a problem and not wrapping our head around the solution we mean to give it.
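
A classic Python example of the first kind, misunderstanding a construct, is the mutable default argument. The function names are made up for illustration:

```python
def add_tag_buggy(tag, tags=[]):
    # Bug: the default list is created once, when the function is defined,
    # and is then shared by every call that omits `tags`.
    tags.append(tag)
    return tags


def add_tag_fixed(tag, tags=None):
    # Fix: create a fresh list on each call that omits `tags`.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags
```

Calling `add_tag_buggy("a")` and then `add_tag_buggy("b")` returns `["a"]` followed by `["a", "b"]`, which is rarely what the author meant. Stepping through the second call in a debugger and watching `tags` already hold a value on entry is usually what makes the misunderstanding click.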

Unlike many of the tools we saw earlier, the debugger need not know a thing about the programming language, because it works at the level of object code – in case you’ve not read about the compilers, that’s what a compiler usually produces. For more details, read Part 3.

Nevertheless, knowing about a particular programming language is of great help to a debugger and the debugging process, because the Compiler can include “debugging information” in the executables to be used with the tool. This includes copies of the lines of code that produce a particular behavior, allowing you to “watch” the behavior of your original variables and lines of code with great flexibility.

Without this kind of information, debugging is really painful and falls into the realm of reverse engineering.

Some debuggers:

  • gdb; it supports over a half dozen programming languages, all supported also by the gcc.
  • LLDB; a part of the LLVM toolchain, it supports C, C++, Objective-C and Swift.
  • jdb; a debugger built for Java bytecode binaries. It’s likely to work with binaries targeting the JVM, even if written in another language
  • pdb; the debugger in Python’s standard library, built for Python programs.

Besides the debuggers, many development environments include debugger front-ends. In particular, the text editors emacs and vim have debugger frontends available.

This is all for Today; Tomorrow, we’ll cover the Profilers.

Development Environment 5

This is the fifth installment in a series describing the components of a typical development environment. The previous installments and their links: Part 1 (Text Editors), Part 2 (Interpreters), Part 3 (Compilers), Part 4 (Linkers).

Using the previously mentioned tools you can write a program and run it. There are some tools that help ensure the program is correct, and if not, find out why. The type of tool we’re looking into today helps with the former, but not necessarily the latter.

Static Code Analyzer

Just like the interpreters, the compilers, and the part of the text editor which provides syntax highlighting, static code analyzers need to have knowledge of the programming language you’re using.

Their main use is detecting common pitfalls that lead to errors and pointing them out to the programmer. A slightly less critical use is making sure that your code complies with a set of (sometimes aesthetic) guidelines, commonly called coding conventions.

The oldest program of the kind that I know of is Lint, which was built to statically analyze code written in the C programming language.

The reason these tools exist is that the rules that define a programming language’s structure aren’t always neatly mapped out in our minds; we make assumptions about the languages we’re working with, and sometimes that can lead to mistakes. When we accumulate enough of these, we can write a program which reads our code and identifies them.
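
To show that such a program doesn’t have to be big, here is a sketch of a one-rule analyzer built on the `ast` module from Python’s standard library. The single rule (flagging list, dict or set literals used as default arguments, a well-known Python pitfall) and the sample code are my own choices for illustration, not how pylint or any other tool is implemented:

```python
import ast


def find_mutable_defaults(source):
    """Return (function name, line number) for each mutable default argument.

    The code is only parsed, never executed: that is what makes this static.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defaults = node.args.defaults + node.args.kw_defaults
            for default in defaults:
                # kw_defaults holds None for keyword-only arguments
                # without a default; isinstance simply skips those.
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    findings.append((node.name, default.lineno))
    return findings


SAMPLE = '''
def add_tag(tag, tags=[]):
    tags.append(tag)
    return tags

def greet(name, greeting="hello"):
    return greeting + " " + name
'''
```

Running `find_mutable_defaults(SAMPLE)` reports `add_tag` on line 2 and leaves `greet` alone, since a string default is harmless.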

As they need to know about the language, they tend to be language specific. Here’s a list of some static code analyzers:

  • pylint; which is made for the Python programming language.
  • TOAD; which works with several SQL dialects.
  • JSLint; which works with JavaScript
  • SonarQube; I’m especially fond of this one because, along with some automated build software, it made me start working towards better programming habits and more predictable code

There’s a counterpart to static code analyzers: static object code analyzers. I’ve not personally used one of those, but I guess that common pitfalls may be identifiable at this level, such as potential buffer overflows.

On the other hand we have dynamic code analyzers; this category covers profilers and debuggers – the latter of which are, incidentally, the subject for Tomorrow’s post.

Development Environment 4

This is the fourth installment in a series about the components of typical development environments. Part 1 introduces the components and the Text Editor, Part 2 covers the Interpreter, and Part 3 covers the Compiler. This part covers one of the less understood components of a toolchain, the linker. It’s easily overlooked because of the prevalence of interpreted languages, and of those which run in a Virtual Machine, none of which use it, but it’s an integral part of a compiled language toolkit.

The Linker

If you’ve been following this series, you may remember I mentioned that usually, compiled programs don’t have all the code needed to execute them.

Where’s the rest? In libraries.

Libraries are what results when compiling code in a manner that is not executable, but can be referenced from other code. The names used in the programming language specify what names from compiled libraries should be referenced for execution. The linker provides, perhaps unsurprisingly, a link between the program’s compiled code and the libraries: it connects separately compiled pieces of code into a single whole.

There are two kinds of uses for libraries: static, and dynamic. The static use of libraries is easiest, so I’ll cover it first.

Static Linking

The way it works is by outputting a single executable code file which contains both the program’s executable code and the libraries. Executable code files are often called “binary files”, and I’ll call them so from now on.

If you write a program which displays something on the screen, the binary produced by the compiler references the names of the libraries that contain the machine code to do it. If you produce a statically linked binary, the libraries referenced and the program binary are stored in a single executable file. The resulting files are big, but can easily be moved between computers that share the same platform. As many libraries are not included in the Operating System, custom or third party libraries can be included like this for ease of distribution. If that wasn’t done, then the binary couldn’t access the code it needs to perform its operation.

Dynamic Linking

This approach works, as the name states, dynamically. When the program is compiled, it’s not merged with certain – perhaps any – libraries. This produces a small executable file which depends on the libraries being installed on the system to function.

It’s best used when no third party or custom libraries are needed, but can tie the executable firmly to a particular environment and configuration. One of the ways this can happen is when an Operating System update changes a library, which can change the functionality or introduce new bugs. In particular, some applications are certified with a fixed environment, which can make system updates undesirable to preserve the application’s certification. In other words: Dynamic linking makes for a smaller executable file footprint, but makes it brittle in ways that, sometimes, are not acceptable for the applications.
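
A related idea can be tried from Python with the standard `ctypes` module: loading a shared library while the program is already running and calling a function from it. Strictly speaking this is run-time loading rather than the load-time dynamic linking described above, but it shows the same dependency on a library being installed on the system. The sketch assumes a Unix-like system where the C math library can be found under the name “m”; elsewhere the function simply returns None:

```python
import ctypes
import ctypes.util


def cos_from_shared_library(x):
    """Call cos() from the system's C math library, loaded at run time.

    Returns None when the library cannot be located or loaded on this system.
    """
    # find_library searches the system for a shared library by name;
    # "m" is the conventional name of the C math library on Unix-like systems.
    path = ctypes.util.find_library("m")
    if path is None:
        return None
    try:
        libm = ctypes.CDLL(path)
    except OSError:
        return None
    # Declare the C signature: double cos(double).
    libm.cos.argtypes = [ctypes.c_double]
    libm.cos.restype = ctypes.c_double
    return libm.cos(x)
```

If the library is missing, nothing in the Python code itself can make up for it, which is the brittleness mentioned above in miniature.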

The only linker I know about is called “ld”, and it’s present in Unix and Unix-like systems. In Windows, I’ve only ever worked with dynamic linking, which means that any particular DLLs need to be installed in the target system or packed as independent files when distributing the software.

[EDIT: There seems to be a widely available gratis Windows linker made by Microsoft, called… link.exe; I’ve never used it manually. Homepage]

The Rust programming language’s linker can link statically under Windows, but as far as I can tell the linker is included in the language’s compiler, rustc.

[EDIT: the compiler invokes an existing linker in your Operating System, it’s not included in the compiler.]

This is all for the week. Next time we’ll be looking at static code analyzers.

I want to take this end-of-the-week space to invite you to e-mail or comment any doubts you may have; these articles are, of course, subject to revision.

Thank you for reading.

Development Environment 3

This is the third installment in a series explaining the common components of software development environments. To get the sequence, make sure to check out part 1 and part 2 of the series, in which text editors and interpreters are covered.

Today, we’ll talk about another component in the toolchain.

The Compiler

In a way similar to the interpreter, the compiler needs to know about the language you’re working on.

The compiler reads a file in a particular format and outputs a file in another one; this can be thought of as a translation. Compilers usually translate source code to machine code.

Because the compiler generates output from the source code, as opposed to executing it as the interpreter does, it needs to understand how to write a file that can be executed in a particular environment.

It is common for compilation in a particular environment to produce a file that can be executed in that same environment, but not necessary. When a compiler produces a file that can be executed in another environment, the activity is called “cross-compiling”, to signal that the file is aimed towards a different environment.

In short: A compiler is a program which knows about the programming language you are using and the environment in which the resulting program is supposed to be executed.

As is the case with interpreters, there can be more than one compiler for any particular language and targeting any particular environment. Here’s a list of some gratis compilers:

  • gcc; this is a widely used compiler collection. It’s a program which can compile from several programming languages and for several platforms. Some of the languages it works with are: C, C++, Objective-C and FORTRAN. It can produce executables for many target environments as well.
  • clang; this is a compiler aimed at the C family of programming languages, and can work with C, C++, Objective-C and Objective-C++; it supports several target environments, notably the x86, x64 and ARM processor architectures, which most computers and cellphones use.
  • fpc; this compiler is aimed towards the Pascal language and its variants, including Object Pascal. It supports many environments, so much so that the “write once, compile anywhere” philosophy is a core idea for the compiler.
  • javac; this compiler is aimed towards the Java programming language, and targets the Java Virtual Machine environment. It’s notable as part of the category of compilers that don’t have a physical machine as a target, but a virtual machine which can then be written in another language and compiled for any desired target. Thus, writing and compiling once would allow a Java program to run anywhere a Java Virtual Machine was compiled to run beforehand.

There are some caveats I’d like to make at this point:

  1. Execution environments are commonly called platforms, and are usually the combination of a particular processor and a particular operating system.
  2. I stated before that the output is usually an executable file. This is not precise information, merely a simplification. The full explanation is that the compiler usually outputs an object code file. It so happens that object code is the code that can get executed by the computer, but it isn’t necessarily formatted in a way that the operating system executes it directly. When not producing executable object code, the compiler is usually producing referable object code.
  3. I said a compiler usually produces object code; but because of the nature of the compiler, taking from a source language and producing some executable output file, when a program produces “bytecode” (object code for virtual machines) or it produces a source file for an interpreter to use (in which case it can be called a transpiler), it’s still a compiler.
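
The bytecode case from the third caveat can be seen from inside Python itself, whose built-in `compile()` turns source text into a code object for Python’s own virtual machine. This is only an illustration of compiling and executing as separate steps, not a claim about how gcc, javac or the other compilers above work; the one-line program is made up:

```python
import dis

SOURCE = "answer = 6 * 7"

# compile() translates source text into a code object holding bytecode;
# nothing is executed at this point.
code = compile(SOURCE, "<example>", "exec")

# A separate step runs the compiled code inside a namespace.
namespace = {}
exec(code, namespace)


def bytecode_listing(code_object):
    """Return a human-readable listing of the bytecode instructions."""
    return dis.Bytecode(code_object).dis()
```

After this runs, `namespace["answer"]` is 42, and `print(bytecode_listing(code))` shows the instructions the virtual machine executed. The exact instruction names vary between Python versions.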

Now, object files don’t usually have all the machine instructions that need to be executed in order for the program to function; why this happens and how this fact fits with the successful production of executable programs will be explored Tomorrow, when we cover the linkers. The answer is related to the non-executable object code files that the compiler can produce.

This is all for Today. I hope this was entertaining and informative.

Development Environment 2

In the previous installment, we saw a list of types of tool used in a development environment, a bit about the purpose of the Text Editor, and a list of existing gratis text editors for several platforms.

After you have the source code (i.e. text files describing what your software is supposed to do in a particular language), you still can’t execute it. One of the tools that help you make your code executable is the interpreter.

The Interpreter

The interpreter is a tool which understands a particular programming language and can execute the instructions. So, if your source code states that a picture should be shown on-screen, the interpreter shows the picture.

Basically, it’s a program which you ask to run through your file and do what your file states should be done. Many programming languages have interpreters, and sometimes a single programming language can have many interpreters. Some languages can only be worked on with interpreters; those are called, rather unsurprisingly, interpreted languages.
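
To make that concrete, here is a toy interpreter for a made-up language in which numbers are pushed onto a stack and the symbols + - * / combine the top two values. The language and its rules are invented for this sketch; real interpreters are far more elaborate, but the shape (read the text, do what it says) is the same:

```python
def interpret(program):
    """Execute a tiny made-up stack language and return the final stack.

    Tokens are separated by whitespace: numbers are pushed onto the stack,
    and + - * / pop two values and push the result.
    """
    operations = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for token in program.split():
        if token in operations:
            if len(stack) < 2:
                raise SyntaxError(f"'{token}' needs two values on the stack")
            right = stack.pop()
            left = stack.pop()
            stack.append(operations[token](left, right))
        else:
            try:
                stack.append(float(token))
            except ValueError:
                raise SyntaxError(f"unknown token: {token!r}") from None
    return stack
```

`interpret("2 3 + 4 *")` returns `[20.0]`, while `interpret("2 +")` raises an error because a rule of the little language was broken.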

An interpreter usually contains either a lot of functionality, the capacity to use external libraries to provide a lot of functionality, or both.

A short list of gratis interpreters follows:

  • python; As well as being the name of the programming language (Python), the interpreter is called python. This naming is typically used for the old version, i.e. Python 2.x
  • python3.5; that’s the name of the latest (as of today) stable Python interpreter for the 3.x version of the language.
  • IronPython; an interpreter for the Python 2.x language which uses the .Net CLR
  • node; it’s an interpreter executable based around the V8 JavaScript engine
  • perl; it’s an interpreter for the Perl programming language
  • racket; an interpreter for the Racket programming language
  • lispy; an interpreter for a LISP written in Python

As you can see, it’s common to have the interpreter executable and the programming language share a name, and there can be several interpreters for the same programming language. This is because the interpreter provides the facilities for the code to run; different interpreters can provide different facilities for the same code to perform the same function across diverse conditions, like different installed Operating Systems.

Unlike with the text editor, each tool of this step is built to work with a particular programming language. It is so because the tool needs to know about the language to be able to understand the source code, but the text editor needs only to understand text input and how to store plain text files.

Usually, the interpreter has no output. There are two notable situations in which you have output:

  • Generating output is the interpreted program’s intent, which means it’s the program’s output even if it’s generated through the facilities provided by the interpreter
  • Things go wrong; in which case you get error messages

Things can go wrong in many ways, but chiefly:

  • a mistake was made when writing the program’s text, such as a typo
  • a rule of the programming language was broken
  • the nature of the program being run makes it not work properly (for example, having a loop that never ends and never does something productive), in which case it’s not a problem directly related to the interpreter but to the written program

This track of the toolkit (editor -> interpreter) doesn’t have any more dependencies to produce running software. Just the text editor and the interpreter. If an interpreter can run on several platforms (as is often the case), it’s very likely that the same code can run on those same platforms. Other tools can be used on the code and running program, like the debugger or the static code analyzer. Those tools will be covered in a few days.

Tomorrow, we will cover the compiler.

Development Environment 1

Since I started cross-pollinating with people who learned in a different environment from mine, I remember questions and remarks similar to this:

I’m using DevC++, what compiler are you using?

This evidences a lack of knowledge about what makes a development environment… a fragility in the ability to get things done that hangs for dear life on a particular way to do things with particular tools instead of knowing how the pieces fit together and being able to mix and match. Hopefully a bit of context can help fill the gaps.

This is the start of a series of posts aimed towards anyone who wants to start developing and needs to make sense of what they need to start developing software and how the pieces fit together, so that a particular tool or collection of tools isn’t what defines the reader, but what helps them be productive in a particular environment and circumstance.

The tools most commonly used in developing are: a text editor; an interpreter, or a compiler and linker; a debugger; an automatic build tool; a profiler; a static code analysis tool.

This series will not go in depth on how the tools work. Instead, I hope to give an overview of what they do, and some examples of known tools in the category. Almost every category of tool used in software development depends on the output of another one to work; thus the combination of tools that get you from code to finished product is called a “toolchain”.

Some programs bundle many or all of a particular toolchain together; some even let you swap tools within a category to better fit your needs and preferences. Those are called IDEs (Integrated Development Environments), and I’ll cover them last.

The Text Editor

First of all, with almost all certainty, your chosen language’s toolchain starts with plain text. This means that you can write the code for it in any text editor. Many text editors provide “syntax highlighting”, which means that they understand enough of what certain programming languages mean to highlight the different elements of the source code for ease of reading. Just to be clear, the source code is what you actually write in a particular programming language to specify the steps that the final program needs to execute.
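
For the curious, here is a sketch of the first half of syntax highlighting: splitting source code into pieces and deciding what kind each piece is. It uses the `tokenize` and `keyword` modules from Python’s standard library and only understands Python code; the category names are my own, and real editors typically use their own, faster machinery:

```python
import io
import keyword
import tokenize


def classify_tokens(source):
    """Return (text, category) pairs, the raw material for highlighting."""
    categories = {
        tokenize.NUMBER: "number",
        tokenize.STRING: "string",
        tokenize.COMMENT: "comment",
        tokenize.OP: "operator",
    }
    result = []
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type == tokenize.NAME:
            kind = "keyword" if keyword.iskeyword(token.string) else "name"
        elif token.type in categories:
            kind = categories[token.type]
        else:
            continue  # skip newlines, indentation and the end marker
        result.append((token.string, kind))
    return result
```

Given the line `if x > 10:  # check`, this yields `if` as a keyword, `x` as a name, `10` as a number and `# check` as a comment. An editor would then map each category to a color.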

Here is a brief, vastly non-exhaustive list of gratis text editors:

  • TextEdit, the out-of-box MacOSX editor
  • Notepad, the out-of-box Windows editor
  • Gedit / Leafpad / Kate, the out-of-box editors which usually install with GNOME, XFCE and KDE desktop environments in Linux
  • Notepad++, an editor available for Windows with some good features aimed at software development
  • Emacs (my editor of choice), a multi-platform and customizable text editor with its own learning curve. If you’ll use this one, I recommend you set some time apart to learn the basics, then start learning to program, then set some time apart to learn how to ease your programming using external modules.
  • Vi (I’ve not used this one extensively; its most used incarnation is Vim), is also multiplatform, also customizable, also has its own learning curve. Its style is very distinct from Emacs’. As with Emacs, though, I recommend you set apart time to learn it properly before heading into programming, then learn to customize it with plugins to ease your programming. It’s particularly important in that it’s standardized in POSIX, so you can expect consistent behavior.

The output of this program is a text file with source code, which is then processed by either the interpreter, the compiler, or the static code analyzer.

This is all for today. I expect we’ll cover the interpreter Tomorrow.