Are You Smarter than a First Grader?

When do children grasp the basic concepts in computer science? When can they learn to program computers?

Children are not all the same. I have known several people who were reading adult literature by the age of 18 months, but most children go through stages of learning at certain ages. They learn to differentiate objects, they learn the rules of gravity and balance, they learn words, they learn to speak, they learn to read and write, they learn math, they learn procedures, and sometimes much later they are taught computer science.

Some scientists noticed that using a computer keyboard assumed an understanding of an alphabet and written language. The introduction of the computer mouse brought the age of the GUI (Graphical User Interface). This made using a computer much easier: it lowered the bar below the level of understanding written language and let someone who could visually differentiate objects and point to them control a computer with a point-and-click interface. Scientists noticed that their children as young as 18 months, who had not yet learned to talk, let alone read, could turn on a Macintosh computer, point to an icon, click on it, and play a game. They might actually be able to "program" when a computer could capture their point-and-click actions in a sequence, record them, create a script, and let them point and click at an icon, a picture, that was linked to this recorded script by other software. And starting in the eighties this level of point-and-click computer programming via a script was brought down to the level of human infants.

In 1951 groundbreaking research on how children's understanding develops was published by the Swiss psychologist and philosopher Jean Piaget. It showed that when a child is younger than seven, if you pour water from a tall narrow glass into a short wide glass the child will say there is less water. If you pour the water back into the tall glass they will say there is more water. Young children equate the amount of water with the height of the water in a glass and don't yet realize that the amount of water is always the same in this demonstration.

Children learn eventually that the amount of water is the same in that demonstration. They also learn things from playing with toy blocks. They learn to stack them, they learn how to pile them up in ways that do or don't balance. In the process very young children learn how a stack works. You probably remember when you learned about stacks of toy blocks.

They learn that they can build a tall stack of blocks and that a stack is accessed from one end, the top. They can add a new block to the stack and it becomes the new top, supported by the block below it that was previously the top. They learn that they can remove the block from the top of the stack and, unless it was the last block, still have a stack.

They learn that they cannot add a block to the bottom of the stack or the middle of the stack or remove a block from the bottom of the stack or the middle of the stack in one step or it will likely all tumble to the floor. They know that if they cause the stack to deconstruct this way they cannot juggle all the blocks in the stack in the air and keep it a stack before the blocks scatter on the floor. Knocking down the stack and watching the blocks scatter into something other than a stack is fun and educational too.

Young children understand the realities of a finite stack and that it is a structure accessed by one end if they want to keep it a stack. They can add or remove blocks at the top, changing what is on top of the stack, but they cannot access the middle or bottom of a stack without it crashing to the floor in a mess and no longer being a stack of blocks.

If one reads the definition of a stack in computer science it is compatible with the understanding of very young children. It is a structure accessed at its top. One can add something to the top or remove something from the top but not from the middle or bottom. But with a procedure one can access the second element in the stack: temporarily remove the top item, which makes what had been the second element the new top, then put back the item that was previously on top.
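The block-stack rules can be written down in a few lines of code. What follows is only a minimal sketch in Python, not anyone's official definition: the names BlockStack and second_from_top are made up for this illustration, and the hidden list is just a convenient way to hold the blocks. Callers are only given push and pop, so the only way to see the second block is the procedure described above.

```python
class BlockStack:
    """A stack that, like a pile of toy blocks, is only touched at the top."""

    def __init__(self):
        self._blocks = []  # hidden storage; callers use only push and pop

    def push(self, block):
        """Put a block on top; it becomes the new top."""
        self._blocks.append(block)

    def pop(self):
        """Take the top block off and hand it back."""
        if not self._blocks:
            raise IndexError("pop from an empty stack")
        return self._blocks.pop()

    def is_empty(self):
        return not self._blocks


def second_from_top(stack):
    """Read the second item using nothing but push and pop."""
    top = stack.pop()              # temporarily lift off the top block
    if stack.is_empty():
        stack.push(top)            # put it back before complaining
        raise IndexError("stack has no second item")
    second = stack.pop()           # what was second is now on top
    stack.push(second)             # put it back where it was
    stack.push(top)                # restore the original top
    return second
```

After second_from_top returns, the stack is in exactly the order it started in. Nothing ever reached into the middle.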

Children also learn that they can put blocks in a horizontal line on the floor and access any of them. This is known in computer science as a one dimensional linear array.

They also learn that they can arrange the blocks in multiple rows and access any of them. This is known in computer science as a two dimensional array as the floor forms a two dimensional surface known as a plane.
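For contrast with a stack, here is a small sketch in Python of blocks laid out on the floor. The letters and variable names are invented for the example. The point is only that any block can be reached directly by its position, with no need to move the others.

```python
# One row of blocks on the floor: a one dimensional array.
row = ["A", "B", "C", "D"]
third_in_row = row[2]          # reach straight for the third block

# Several rows of blocks: a two dimensional array (rows, then columns).
floor = [
    ["A", "B", "C"],
    ["D", "E", "F"],
]
second_row_last = floor[1][2]  # second row, third block in that row
```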

I learned addition, multiplication, and long division in grade school. A few years later in grade school I learned the procedure to take square roots. About that time they introduced what they called "new math" as part of the effort to teach children math and science to be able to compete in the days of the Cold War. This new math included set theory, logic rules, and Venn diagrams. It let the children understand the math used in words like AND, OR, NOT, and NOR (NOT-OR).

In a Venn diagram of two overlapping sets A and B, the set of all C, the overlap, is in both the set of all A AND the set of all B.
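The same set logic can be tried out in a few lines. This is a sketch in Python with made-up numbers: the contents of A and B and the "universe" U of everything under discussion are assumptions chosen only to make the overlap visible.

```python
U = set(range(1, 9))   # everything under discussion: 1 through 8 (assumed)
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

C = A & B              # AND: in A and also in B (the overlap)
a_or_b = A | B         # OR: in A, in B, or in both
not_a = U - A          # NOT: everything that is not in A
a_nor_b = U - (A | B)  # NOR (NOT-OR): in neither A nor B
```

With these numbers C holds 3 and 4, and a_nor_b holds 7 and 8, the two items that sit outside both circles.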

By the seventh grade I had learned more math. I began to understand how the algorithms worked and how the procedures worked for long division or square roots. At that point I really wished someone had shown me those things in the first or second grade rather than only having been taught the procedures by the rote method with no real understanding of what I was learning. Once I understood how the procedure for doing square roots worked I was able to do cube roots or higher power roots in my head. By the tenth grade I was practicing doing things like seventh power roots of ten or twelve digit numbers in my head and wished someone had shown me this stuff when I was half that age.

As an adult I came to the conclusion that teaching children math the way it had always been done, forcing rote procedures into kids' brains to the tune of a hickory stick, was not the best way. Children were capable of understanding a lot more a lot earlier than that. I came to the conclusion that they should learn the basics of computer science before they learn to multiply and divide. They should understand the algorithms when they learn a procedure like long division.

But this is not the way children are taught.

I have noticed that many adults, computer programmers, and professional programmers have forgotten the basics they learned when they were very young children. The recent trend to make using programmers a cheaper commodity for business purposes has resulted in a trend to popularize programming in scripting languages where only a few programmers have to understand the basics of what is really going on and most of the scripting programmers can think like a child as young as 18 months old and just point and click to try to let someone else's program make the program work.

If you talk to someone who only wants to pay for the cheapest and lowest level script programmers this is a good thing. If you talk to programmers who do more sophisticated things in computer science, or to teachers, they may lament this dumbing down of the education that the current generation receives.

One of the most obvious problems is that many adults have forgotten the things they learned when they were a few years old and learned how a stack works.

The computer language Forth is a stack based language. Most things are done on a stack, which by definition is accessed from the top. Procedures can be nested, with each procedure calling other procedures and being able to return to the calling procedure. Forth keeps track of this nesting with something known as its return stack. Data used by the procedures is kept on a data stack with the idea that the data that is wanted at any time should be on the top of the data stack.
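To make the two stacks concrete, here is a toy sketch in Python of a machine with a data stack and a return stack. It is not a real Forth and is not meant to show how any actual Forth system is built: the function run, the handful of words it understands, and the definitions "square" and "sum-of-squares" are all assumptions made for this illustration. Numbers go on the data stack, and each nested procedure call leaves a note on the return stack saying where to resume.

```python
def run(program, definitions):
    """Run a list of words; return the data stack (last item is the top)."""
    data = []      # data stack: the values the procedures work on
    returns = []   # return stack: where to resume after a nested call
    words, pc = program, 0
    while True:
        if pc == len(words):           # reached the end of this procedure
            if not returns:
                return data            # no caller left, so we are done
            words, pc = returns.pop()  # go back to the calling procedure
            continue
        word = words[pc]
        pc += 1
        if isinstance(word, int):
            data.append(word)          # a number is pushed on the data stack
        elif word == "+":
            b, a = data.pop(), data.pop()
            data.append(a + b)
        elif word == "*":
            b, a = data.pop(), data.pop()
            data.append(a * b)
        elif word == "dup":            # copy the top item
            top = data.pop()
            data.append(top)
            data.append(top)
        elif word == "swap":           # exchange the top two items
            b, a = data.pop(), data.pop()
            data.append(b)
            data.append(a)
        elif word in definitions:      # call a nested procedure
            returns.append((words, pc))
            words, pc = definitions[word], 0
        else:
            raise ValueError(f"unknown word: {word!r}")


definitions = {
    "square": ["dup", "*"],
    "sum-of-squares": ["square", "swap", "square", "+"],
}
result = run([3, 4, "sum-of-squares"], definitions)  # leaves 25 on the stack
```

Every primitive in this sketch works only with pop and append, which is to say only with the top of its stack. The nesting of "square" inside "sum-of-squares" is tracked entirely by the return stack.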

That approach makes the language the simplest, the smallest, the most compact, and the most flexible and expressive computer language I have encountered. When done well it makes programs a hundred times smaller and lets them run on computers a hundred times smaller, cheaper, and lower power than more common computer languages do.

Why should anyone care about that? The amount anyone can do depends on how much time they have to waste reading and writing programs that are a hundred times longer than they need to be. This also determines how much the computer costs, what things it can do economically, and how much power it uses.

As a society we all pay for the hundreds of billions of dollars of electricity used just to power Personal Computers, PCs. The prohibitive cost of these machines limits their access to a few percent of the population, and we can't use them to educate more than a few percent of the next generation.

Personal Computers make up only a percent or two of the computers made each year. Most computers today are "embedded" computers, those built into some device to do a specific job. When I was in college in 1971 my Physics teacher told us that the invention of CMOS circuits that used voltage to do logic rather than power, and the invention of LSI, large scale integration of logic on one chip, and specifically the invention of the microprocessor were going to change everything around us. He correctly predicted that before long an embedded computer would be cheaper than a mechanical switch or a few inches of wire and that they would be built into every washing machine and kitchen appliance. He predicted that it would even lead to individuals having Personal Computers sitting on the desk at home or at work.

Meanwhile the people teaching computer science were very skeptical that these new microprocessors would ever amount to anything and questioned if they were even really computers at all. After all they knew computers filled up rooms, needed enormous power, and were so hard to program that only they could do it. They were wrong of course, as we all know. The average American household contains dozens if not hundreds of small embedded computers and perhaps a few PCs. Some cars today have more than a hundred computers in them. People all over the world use cell phones and appliances and other consumer devices that account for a market of many billions of new computers being made every year.

But most people are not very familiar with the design or programming of these embedded computers. The rule on these machines is that the goal is to make them with as close to zero cost and as close to zero power usage as possible. To do that they have to use good mathematical programming to keep programs, memory usage, cost, and power consumption to a minimum.

The Forth programming language and embedded Forth computers remain at the forefront of this effort. They have been on the leading edge of performance/price and have been leading the way in simple parallel programming and multi-core designs. The patents developed have been licensed for use in big PC microprocessors, embedded computers, and most electronic consumer devices produced by companies throughout the world. But most people don't know much about it. They may not know that people could buy a chip for $10 with 144 complete computers on it, each one running at hundreds of megahertz in parallel and using less power than almost anything else. My website has been all about understanding the history of this and how it happened and how it works.

Many people seem to have forgotten the most basic concepts from the real world like how a stack works. Some popular languages that make things a hundred or a thousand times more complicated than they need to be may call things stacks that are not stacks at all but actually are linear or multi-dimensional arrays in random-access memories.

A first grader knows that a stack is not something where one can randomly access the middle or bottom of the stack. But many adults these days don't seem to remember or understand this concept that children learn before they learn to read or write.

Many people know that when they use more hardware, more power, and more memory than they need they can stick to programming with inefficient scripts. And they have learned that they can build expensive arrays in random-access memory and use them as a stack or as an array. They seem to have forgotten, however, what they learned as small children: that stacks are something much simpler than all that.

I am often asked, "Why can't I access the bottom of a big stack of things first?"

The answer to that is well understood by very young children. Then it would not be a stack.

People can build a more expensive and more complicated structure in more expensive and slower random-access memory and sometimes use it as a stack. They can also treat it as what it really is, an array in expensive and slow random-access memory, in order to access the bottom of this virtual stack. They may forget, however, that this isn't really a stack at all.

For instance the computer programming language C does not have a data stack and a return stack like Forth. (*1) Because it does not have a data stack and a return stack, C typically uses an array in random access memory instead. It has stack-frames, often incorrectly called just stacks, that are actually arrays in random access memory used to hold both data and return information for procedures at the same time. These things can be accessed via pointers into the array. These stack frames are far more complex, slower, bigger, and more expensive than actual stacks like those used on newer stack machines and with the Forth programming language.
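The difference can be sketched in code. The following Python model is a deliberate simplification and an assumption on my part, not the layout any particular C compiler or processor uses: the class FrameMemory, its cell layout, and the method names are invented for the illustration. It shows return information and data sharing one array, and data being read by computing an address rather than by taking something off a top.

```python
class FrameMemory:
    """A toy model of stack-frames kept in one random-access array."""

    HEADER = 2  # cells used per frame for return info (assumed layout)

    def __init__(self, size=32):
        self.cells = [None] * size  # the random-access memory
        self.sp = 0                 # index of the next free cell
        self.fp = 0                 # index where the current frame starts

    def enter(self, return_address, values):
        """Build a frame: return info and data side by side."""
        base = self.sp
        self.cells[base] = return_address      # where to resume the caller
        self.cells[base + 1] = self.fp         # where the caller's frame starts
        for i, value in enumerate(values):     # the procedure's data
            self.cells[base + self.HEADER + i] = value
        self.fp = base
        self.sp = base + self.HEADER + len(values)

    def local(self, offset):
        """Read any data cell of the current frame by address arithmetic."""
        return self.cells[self.fp + self.HEADER + offset]

    def leave(self):
        """Drop the current frame and hand back the return address."""
        return_address = self.cells[self.fp]
        self.sp = self.fp
        self.fp = self.cells[self.fp + 1]
        return return_address
```

Note that local can read the first, last, or middle value of a frame equally well. That is array behavior, which is exactly why such a frame is not a stack in the toy-block sense.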

The first grade question is, "True or false? You access a stack of things from the top."


(*1) ANS Forth excepted, as it is an 18-year-old programming language standard made to embrace both Forth and C language features. It has something that is called a return stack but which may not actually be a real return stack used for nested procedure return information at all. The standard says one can only count on it to be a temporary data stack. And it has something called the data stack which is likely not really a stack at all, as it must really be a linear array in random-access memory like the stack-frames in the C language to be compliant with words (definitions) required in the ANS Forth Standard document.