Wednesday, June 08, 2011

Conquering Excel Macros

VBA is a terrible language. It's archaic and idiosyncratic. I recently got back into programming in VBA since yesterday to help out a friend at work to process data faster. It would take him weeks to what this program will be able to do in minutes. The only kicker is that it costs me time to write this code.

I've transitioned over to Python as my main programming language about 4 years ago and I've learned a lot from it since starting. Functional programming and the map, reduce strategy to computing data. Basically, if you can write out a function to process one block of data, it's just a matter of looping through the rest of the data array to calculate everything.

What python (and several other languages)  do is obsolete the loop and make them implicit. If you have a function and you have an array, then there is way to apply the function to all the elements of an array with one line of code. None of that

  • let's create counter,
  • create the loop structure,
  • create an output array,
  • pass the data to the function,
  • dump the output data into the array,
  • and increment the counter to get to the next piece of data

process. You have a spoon and there are many buckets of ice cream, what else do you need to know? Functions and list comprehensions work just like that. In a single line of code that 6 step process is gone. Code that can comprehend lists are amazingly short and the hard thing to do it limited to writing the function.

I work in Python and I think in Python when I program. Then I come back to VBA, it doesn't have the syntax to do that. What usually took me 1 short line of code takes me 4~7 lines of longer code to do the thing. It's a waste of time and prone to error.

So basically, I've spent the last 2 days making VBA more Python like by creating the data structures to alleviate the lower level management of data array comprehension to the point where I can throw arrays around and process them in 2~3 lines of short code. It's not a perfect solution, but much better and far more manageable. Compared to the old procedural code that I used to write, managing old procedural code is a nightmare to maintain. The great thing about list  comprehension is that you can worry less about the state of the program because the code becomes stateless. None of that "what was the value of the counter?" and that pizzaz.

There are times when the state of the program is important however, like for instance a user application: What screen am I on, or what settings have I changed, or am I still connected to the internet or something like that. But still, the overhead of dealing with states associated with data greatly decreases with list comprehensions (there will be cases where the data will require state changes in a function, but the overhead is greatly reduced).

At the time, instead of having multiple lines to describe a process, you can have just one which is far easier to understand because the unimportant scaffolding is hidden; the only thing showing are the important parameters and the name of the function. That is how good code is written. So farewell to the terrible looping structure, I've gotten rid of most of them and now left with the descriptive code that tells me what it does and what it operates on.

That's the beauty of higher level computing languages, in that you're able to do a lot with saying a little. There are even higher level ones where you can define your own keywords and syntax to them. A not very well known language is Lisp that allows one to do just that. I have no idea what is possible when one is able to define one's own language to suit whatever problem they are tacking with but I am quite sure it would be a very fascinating adventure. I've already seen my programming skills improve greatly by using Python. I can only imagine what else is there to learn by learning higher level languages.

No comments: