The Annual Forth Day meeting of the Silicon Valley Chapter of the Forth Interest Group was held on Saturday 11/14/98 and included representatives from the Sacramento and North Bay FIG Chapters.
I (Jeff Fox) was not able to attend the morning session, so I was unable to see the presentation that Michael Montvelishsky had prepared about the profile of the hardware and software resources in the iTV embedded products and the contrasts between the implementations of web browsers and email programs on PCs and at iTV. Of course I knew what he was going to say. I arrived for the bbq, which had been provided by Dr. Ting.
Michael had set up the 5 inch B/W TV/FM w/ iTV internet appliance and was giving demos of the features implemented in the OS and email and browser programs including an interactive Forth interpreter task and letting other people try the product.
I volunteered to give a quick presentation after lunch and informed people that we expect F21d prototypes back this week and that I had released the latest F21 Emulator and demo programs for free at the web site, www.UltraTechnology.com. I explained that the emulator was now getting up to 10 F21 mips, which was about 1/20th the speed of F21 in the best case for F21 and 1/2 or 1/3 of the speed that F21 will get in programs that access DRAM and encounter offpage memory timings. I said that even on my fairly slow laptop the F21 video demo in the emulator was now almost too fast to see.
The last presentation was Chuck Moore and his Fireside Chat. I made a video of the presentation, although Chuck used an overhead projector most of the time so neither the lighting nor the sound is ideal. I am working on a transcript and when done I will convert it into an html document and post it at my site.
(Chuck) The first thing that I want to talk about is this slide. It's kind of neat. It was constructed with my editor. That's how I work all day making things that look like that. So it's very easy to construct a special purpose list of comments that I can show off. So Color Forth comes with a built in slide maker. (laughter)
FORTH DAY CHUCK MOORE COLOR FORTH COMMAND LINE HIDE/REVEAL SPREADSHEET I21SL TEMPERATURE IV CURVES INTERNET PRINT
(Dwight-) They all say PRINT at the bottom. (laughter) (Chuck) I was going to mention that. There is a little problem or feature. You can't really see the difference between the black on Forth Day and the blue on Chuck Moore. This is literally a screen dump onto an HP printer. I could work on the colors and make things legible and I really need to. Everything that I am doing and everything that I will show you today is a work in progress. The priorities are such that none of this is mainstream. But I am working more than ever in Color Forth, and I really like it.
The i21sl is the latest version of the chip and it's been back for three weeks. (someone-) How do you tell the difference between a one and an I? (referring to Chuck's CAD font in OKAD on the screen dump) (Chuck) I can't. But it really doesn't matter anyway, but I will be designing a new character set that will make that evident and a utility program to do it. (someone-) What kind of keyboard do you use? (Chuck) A normal PC keyboard. (laughter) (and people say that I just make this stuff up. This time I have a video ;-)
Here's a screen full of Color Forth. (You need a color browser to see the colored text)
T3/2 TABLE REMEMBER IT : 65 LOAD TA 294 ; FILL -1 + TA OVER TA + /. 3/2 OVER T3/2 ! IF FILL ; THEN DROP ; 100 FILL IT END INT P 64 p 63 (obscured by projector)
The first thing I want to talk about is the bottom line. The paper that I will give at FORML is the new command line. I have not heard anyone else mention it to you so I will, but it scrolls from the right. Normally you type characters starting from the left and move across the screen with a cursor. This doesn't. The characters appear on the right and scroll to the left. There is no cursor. I don't need one. The advantage is that I get a whole line of history with no need for vertical scrolling. The beauty of that is that it doesn't interfere with whatever is on the screen. Whatever is up here stays there. It's just the bottom that's used for command line scrolling. Before I had two lines for history and it scrolled up one line and I had a little window with one line of history. But this is even better because it turns out that I get even more history this way than I did that way and it takes less space.
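(Editorial sketch, not part of the talk.) The right-scrolling command line described above can be modeled in a few lines of Python. The class name, the method names, and the line width are all assumptions made for illustration; nothing here is taken from Chuck's Color Forth source.

```python
from collections import deque


class ScrollingCommandLine:
    """One-line command area: new characters enter at the right edge."""

    def __init__(self, width):
        self.width = width
        # Only the most recent `width` characters are kept. Older ones
        # fall off the left edge, so no vertical scrolling is needed.
        self.cells = deque(maxlen=width)

    def type(self, text):
        """Append typed characters; there is no cursor to move."""
        for ch in text:
            self.cells.append(ch)

    def render(self):
        """Right-justify so the newest character sits at the right edge."""
        return "".join(self.cells).rjust(self.width)


line = ScrollingCommandLine(width=16)
line.type("65 LOAD ")
line.type("100 FILL")
print(repr(line.render()))
line.type(" IT")
print(repr(line.render()))
```

The whole visible line doubles as history: whatever was typed earlier stays on screen until it is pushed off the left edge, and the rest of the display is never touched.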
I think this is a great example of conserving the valuable resource of the pixels on the screen. (laughter) It's a resource that nobody cares about. I saw an ad in one of the internet magazines. I don't remember what browser it was but it was a browser frame that you put up on your screen and customized with all these buttons and dials, and that seems to be the pace at which this world is going. They're willing to give away the edges of their screen. (motioning to show that the useful browsing area was surrounded by all these custom buttons and gauges in this ad) I saw another example. It was some application I was working on. It had something to do with word processing that had three layers of frames, and in the middle, in about 1/8th of the area on the screen available for text, was part of the text that I wanted to see. It's ludicrous that they're willing to give away most of their 1024x768 pixels and I'm not.
Now this (the Color Forth Screen), I saw a word earlier, MARKER ? Is that ANS or something? REMEMBER has gone away? all right, I'll use MARKER. It's shorter anyway. I'm trying to shorten words.
Here's an application that builds a table. It's a table of temperature to the three halves power. This is the code to build it. And the thing to notice is that the word FILL is referenced inside of itself as a jump back to the beginning of the definition. This is relevant to the zealous debate that's been raging on the standards committee. It used to be called SMUDGE and I guess the debate is about what to call it. But anyway I've given up on that. It's just too complicated. Color Forth is brutally simple and it will become even more brutally simple. This construction of a jump back to the beginning is very convenient and it saves a lot of BEGINs and SWAPs and confusion. I think the control flow is clearer here.
Every word here has a space preceding it that describes exactly what it is supposed to do. This little character here that turned this word blue means that it is a comment and the next word is also blue. This word is black on this page and white on the screen and it is executed. This word (IT) is being defined. Now remember that the word REMEMBER says that the next word you define will have the DOES> behavior. And a word being defined is red so I am defining IT at this point. Having done that I use a colon to switch the behavior back to that of normal colon definitions in future definitions. Of which there are a bunch on block 65 and then this word TA. Now the body of the definition of TA is in green. And this 294 and the semicolon makes this a constant. That is how I define constants in Color Forth. It's the most efficient definition. It avoids this long word CONSTANT that doesn't convey a whole lot of information.
(George- asks about the colon in the definition.) (Chuck) The colon is the specification that future red words are colon definitions. At define time the address is that of colon rather than that of remember. I'm using the function keys. I've given a color to each function. So this acts like a constant.
(John-) A DECIMAL constant too. (Chuck) Yes, and the word IF here should be black and THEN too because those words are executed. And likewise outside of this definition I switch back to black and execute these words. 100 FILL executes the word. IT forgets everything from here (REMEMBER) and END marks the end of the block. Other than that this code is probably very self explanatory. Here is a -1 +. Which is the way I do things. I do not have a subtract. This is partly experimental. I want to decide if subtract is an essential operation to have. The answer is no. If you are doing comparisons you use exclusive or. If you are doing arithmetic you use minus numbers and very occasionally you need to do an actual subtract.
This code is actually used by that word.
SQRT 1. 1FF. ; *. 1. */ ; /. 1. SWAP */ ; 3/2 DUP DUP *. *. *. SQRT DUP 1. - 1 + + 2/ 1. + SWAP OVER /. + 2/ ; END
This is a square root which I would like to talk about as a square root, apart from being an example of Color Forth. Here is a cyan constant 1FF. One of the things that pleases me most is the colorfulness of the screen. It is real dull to look at a black and white screen. That's why I'm willing to add more colors. Here's an example of where I want to put in more colors. I'm not there yet. The first couple of words here are fixed point arithmetic. (1. *.) I am defining this (1.) as the number 1 dot and it's got 511. And to multiply two fractions you do a 1 star slash. To do a divide you do a 1 swap star slash. It's not the most efficient way of doing it, you should do a multiply and a shift, but this is the easiest way of doing it as a fractional multiply. I use that in a square root. I am taking the square root of a number near 1 scaled by 512 and getting a number scaled the same way. You do that by basically taking one minus one half x to get the square root of one plus x. And then I do a Newton-Raphson iteration to get four significant digits. Now I've got a square root good to nine bits, a very cheap quick and dirty way of getting some square roots. The reason I wanted a square root is that I am taking T (temperature) to the 3/2 power. I do that by T cubing and taking the square root. The cubing part is just DUP DUP *. *. *. and that falls into square root because the headers are elsewhere. I'm very fond of this style of having a couple of words cascaded and falling into another word. It is in effect multiple entry points into a word. (Mark) Shame on you. (laughter) (Chuck) I get multiple exit points as I had in the previous word where I had IF FILL semicolon. Semicolon does not mark the end of a definition. The end of a definition is marked by the absence of any more green words.
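(Editorial sketch, not part of the talk.) The fixed-point square root can be followed more easily in a conventional language. The Python below is one reading of the screen dump, with 1. taken as 511 (hex 1FF), the `-` in the dump taken as one's complement, and the first guess worked out from the words as 1 + (x - 1)/2 followed by a single Newton-Raphson step. The function names are invented for the sketch, and the cube in `pow_3_2` uses two multiplies, which is an assumption about what the garbled dump intends.

```python
import math

ONE = 0x1FF  # 511, the talk's "1." (unity in this fixed-point scale)


def fmul(a, b):
    """Fractional multiply, the talk's  *.  ( a b -- a*b/ONE )."""
    return a * b // ONE


def fdiv(a, b):
    """Fractional divide, the talk's  /.  ( a b -- a*ONE/b )."""
    return a * ONE // b


def fsqrt(x):
    """Square root of a scaled number near 1, result in the same scale."""
    guess = ((x - ONE) >> 1) + ONE         # first guess: 1 + (x - 1)/2
    return (guess + fdiv(x, guess)) >> 1   # one Newton-Raphson step


def pow_3_2(x):
    """x to the 3/2 power: cube, then take the square root."""
    return fsqrt(fmul(fmul(x, x), x))


# Compare against floating point for a few inputs near 1.0 (511).
for x in (400, 511, 600, 700):
    exact = math.sqrt(x / ONE) * ONE
    print(x, fsqrt(x), round(exact, 2))
```

One step is enough here only because the input stays close to 1, where the linear first guess is already good; far from 1 the result would need more iterations.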
Here this one dot minus one plus is an awkward way of computing a number. I get this number 511 and this is not the binary subtract operator it is the one's complement operator. (like in the MISC instruction set) This almost makes this into minus 511 but not quite you have to add one to it first. Now my current thinking is that if I made this yellow it would be executed at compile time and the result would be put on the stack and would have to be compiled as a literal. I've been worrying for a year or so about how to do that because I don't want to have to say LIT. It is an ugly word with no meaning to say put the literal on the stack into the definition. If I make this yellow the stop of yellow or the switch from yellow to green will make that function.
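(Editorial sketch, not part of the talk.) The arithmetic described above, together with the earlier remarks about doing without a subtract operator and comparing with exclusive or, can be checked in Python, where `~` is the one's complement operator. The function names are invented for illustration.

```python
def ones_complement(n):
    """The talk's  -  operator: flip every bit (Python's ~)."""
    return ~n


def negate(n):
    """Two's complement negation: complement, then add one."""
    return ones_complement(n) + 1


def subtract(a, b):
    """a - b built only from complement and add."""
    return a + negate(b)


def equal(a, b):
    """Comparison by exclusive or: the result is zero only when a == b."""
    return (a ^ b) == 0


print(ones_complement(511))  # -512, "almost" minus 511
print(negate(511))           # -511
print(subtract(100, 1))      # 99, the same as  100 -1 +
print(equal(294, 294), equal(294, 295))
```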
(Jim) Now I have a question. All this stuff here is constants or stuff that can easily be determined as only affecting stuff that is only on the stack during the definition. Why don't you have the compiler figure out when it would be a good time to inline stuff?
(Chuck) That's what I'm trying to avoid. Because the compiler has to figure it out every time it does it and I can figure it out once and for all and be done with it. I'm planning on compiling these applications every time I am going to use them so my dictionary will always be very small. I won't have any need for vocabularies for instance. And I want the compiler to be very fast. And decisions that I can make at, we don't have a name for it, programming time are cheaper than decisions that are made at compile time which are cheaper than decisions made at runtime. And besides it adds another layer of color.
Now if I have a word in here that is not a number, and the operator is yellow, it will be executed just as if it were white. But I think it will be more colorful in yellow.
(question: Do you hit a function key to set a color and then that color applies until you hit another function key?) (Chuck) That's the way I used to do it and that is kind of the way I am doing it now and I don't want to. I want there to be an explicit character. The space bar might work. I'm not clear at the moment on how to use the space bar. The space bar at the moment generates a space which has no color information. I'm tempted to make the space bar generate white. Which is perhaps the most common use. But these change from week to week. It isn't hard to change things.
Here's another block. Blocks are 256 characters long. In this case there is no END. It falls off the end and goes into another block. Keeping things arranged by blocks is convenient in that I've got the whole thing in front of me at one time.
TEMP EMPTY 65 LOAD VARIABLE H NO : ORG H ! ; H! DUP PUSH @ FFFF0000 AND OR POP ! ; . H @ H! 2 H +! ; KS 14 FF * ; TA 294 : DT SWAP 10 NO @ */ SWAP OVER KS */ AT DT TA + TA SWAP /. 3/2 /. ; D DT SWAP DROP 66 N 67 F 66 PRINT
Ok, here's a comment, TEMP. Here's a use of VARIABLE. It specifies that the following words are variables. And here's two variables, H and Number. (NO) So I can define a bunch of variables in a line without repeating the big long word VARIABLE. And then Colon says I am changing to colon definitions (at ORG). Here I'm defining Origin and H! and H , it is a little compiler. I'm compiling stuff into memory, what I am compiling is actually .. I'm getting ahead of myself.
So I have mentioned command line hide and reveal and now I'm off on spreadsheet. This is actually the code for spreadsheet. This is what actually does the work.
Let me show you what a spreadsheet is. This is a spreadsheet. It is another example of color mapping. Black doesn't show up well on top of blue but it does show up well on top of all other colors.
-25 235 -25 240
These five numbers describe two sets of curves. These are the IV curves for the current LG process. These are N transistors and these are the P transistors. You can't quite see it but around that number (5 in 15) there is a ring, indicating that that is the number that will change if I push the arrow keys. Right or left arrow will move the ring and up and down arrows will increase or decrease the value stored there. So I can sit here and adjust these numbers and watch the curves change.
There are dots in these curves that represent the measured values. For one block of source code I have a very convenient method for displaying numbers and changing their values. To me that is a spreadsheet. I guess spreadsheets are a lot more elaborate than that.
This is the source code and this has been on my mind for a long time. I am very pleased that I was able to do it and that it was simple as this. Here I am just putting up some color backgrounds so that does away with labels up here too.
UAN VTN UAP VTP N+N N+P NLN SHN SLP SHP IV END
This block, these are Forth words. They are interpreted and as these words are executed they execute a version of dot that puts out a number on top of them. So the location of the word indicates where the word's value will be displayed. The color of the word gives you some hint as to what the meaning is. It is very easy to edit a block like this to represent the data as I want it to appear.
This is the word IV that draws the IV curves. And then END marks the end of the block. So you see everything is here and there is correspondence to the layout. (on the previous overhead) So far I have constructed two of these interactive screens and I figure I will do many more.
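(Editorial sketch, not part of the talk.) The interaction described for the spreadsheet screen, with named cells displayed in place, a ring around the selected number, and arrow keys to move the ring or change the value, can be modeled as below. The class, the key names, and the zero starting values are assumptions for illustration; only the cell names come from the block shown above.

```python
class Sheet:
    """Named cells shown in place; a ring marks the cell being edited."""

    def __init__(self, cells):
        self.names = list(cells)    # display order, left to right
        self.values = dict(cells)   # private copy of the starting values
        self.ring = 0               # index of the selected cell

    def key(self, name):
        """Left/right move the ring; up/down change the ringed value."""
        if name == "right":
            self.ring = min(self.ring + 1, len(self.names) - 1)
        elif name == "left":
            self.ring = max(self.ring - 1, 0)
        elif name == "up":
            self.values[self.names[self.ring]] += 1
        elif name == "down":
            self.values[self.names[self.ring]] -= 1

    def render(self):
        """Show each value beside its name; parentheses stand in for the ring."""
        parts = []
        for i, n in enumerate(self.names):
            text = f"{n}={self.values[n]}"
            parts.append(f"({text})" if i == self.ring else text)
        return " ".join(parts)


# Placeholder values: the real numbers on Chuck's screen are not recoverable.
sheet = Sheet({"UAN": 0, "VTN": 0, "UAP": 0, "VTP": 0})
for k in ("right", "up", "up"):
    sheet.key(k)
print(sheet.render())
```

In the talk the curves are redrawn after each change; here a caller would simply call whatever drawing routine it has after `key`.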
Originally I wanted these numbers to be right justified and I picked the word size so that it would line up nicely on the right. But in fact I never got around to writing a right justified output and it really doesn't matter.
That is Color Forth, and that's probably enough said about it. It keeps evolving. I am using it in the context of chip design. Chip design has the priority. I add features as necessary somewhat guiltily but doing the spreadsheet thing has made it easier for me to do some experiments. You have to be careful about making it easier because you have to have a payoff.
The payoff is this. The i21sl came back about two weeks ago and almost works. This was on the LG fab line as opposed to HP. I've spent five years practicing for HP and this is the first time I've done LG. The processes are not that much different but they're different enough that a chip that worked for HP didn't work for LG. The question is why. The answer is the IV curves.
I've probably shown you these before. They are similar to all IV curves. There are two changes I've made. The first is the slope of this line. This is the curve for five volts. This curve is much flatter than it would be for four volts. At four volts it would go over this way and then down. At three volts it bends over even more strongly. If you're familiar with the textbooks you know the curves look very different than this. They show curves that are flat and come down. And that flatness is a lie. This is the shape the curves have.
This is a one tile wide N transistor and this is a two tile wide P transistor and this says that the ratio N to P is larger than one to two. That is no surprise because this is process independent, but this disparity is extraordinary. Again, the reason for it is temperature.
We measured these IV curves because I didn't trust HP to measure them for me. I don't trust LG to measure them for me. I want to know what they really are. So we have got a little bit of circuitry on the chip to let us measure this. And the measurement looks flatter than this also.
But when we measure the transistors they are hot. Current is flowing through them. Heat is being deposited in the channel of the transistor in a very small area. So these transistors are running a hundred degrees above average. And the resulting curve if they were running at ambient would be this. When I take out the temperature effect the curves increase by about 50%. 50% is a huge difference. A 50% difference more than what I had assumed when I designed the chip. This effect is unknown to almost everybody. You're the only people who know. (laughter)
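(Editorial sketch, not part of the talk.) The "about 50%" figure is consistent with the temperature-to-the-three-halves table built earlier. Assuming, as a reading of that code and not something stated outright, that transistor current scales as absolute temperature to the minus 3/2 power, a transistor running a hundred degrees above the 294 ambient (the constant TA) would carry noticeably less current than the same transistor at ambient. The function name and the exponent parameter are invented for the sketch.

```python
TA = 294.0  # ambient temperature in kelvin, the constant TA from the talk


def hot_to_cold_gain(rise, exponent=1.5):
    """Factor by which current grows when a transistor running `rise`
    degrees above ambient is corrected back to ambient.

    Assumes current is proportional to T ** -exponent (an assumption)."""
    return ((TA + rise) / TA) ** exponent


print(round(hot_to_cold_gain(100.0), 2))
```

Under that assumption the correction comes out a little above one and a half, which matches the increase Chuck describes when the temperature effect is taken out.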
The reason for this is that everyone measures their transistors hot. This has been hinted at in the literature. Some people have said that they know they are measuring the transistors while they are hot but what I've never seen anybody do is convert the curves back to what they would be colder. When they are cold they are much more systematic than when they are hot. If you take this five volt curve and drop it back to hot you get a big correction. You get a much smaller correction at four volts and virtually no correction at three volts. That is because the energy you are depositing is proportional to voltage times current. Cut the voltage in half and you go from a significant effect to a negligible effect. It's a negligible effect in P transistors since their current is less than half of that of N transistors.
So I was pleasantly surprised to see such a large effect. This accounts for all of the disparities that we observe in the chips. For a long time we were plagued with circuits that would work at three volts or four volts but not at five volts. Nothing that I could do would make them not work (in OKAD) at five volts. Doing this they don't work at five volts.
I feel we have reached the truth at last. I think last year I said that I thought that I had found an effect to account for this and I hadn't. Conceivably I'm wrong here too but it looks good.
(Dr. Ting asks a question about chips running 100 degrees above ambient) (Chuck) The chips are running three degrees above but a very small region on a very small transistor is burning hot. It gets hot almost instantly. As soon as I turn it on the temperature ramps up ten degrees. If I leave it on it will ramp up twenty, thirty, forty, eighty degrees.
(Another question from Dr. Ting) (Chuck) Yes, we have an oscillator on the chip that I can switch from one configuration to another and with an on-chip counter to measure the period of that oscillator. I can display the period on the screen as a spectrum. And can get multi-line spectra of 18 values. If I adjust these curves to fit those values then I trust the simulation.
The temperature of a transistor cannot be measured. We've thought of a lot of ways. You can take an infrared viewer but the resolution is so coarse that you just get a blur where the transistor is. Using the simulator we are seeing things that are unobservable. This harkens back to quantum mechanics where all of the interesting things in quantum mechanics were deemed unobservable by Heisenberg.
We have six million atoms in the part of the transistor that heats and that is enough to (noise). I have implemented this temperature effect several times and I have gotten it wrong every time except the last one. (laughter) And the characteristic is, and this is interesting for anyone who wants to control ovens, that the phenomenon of temperature rise is distinctly different than that of temperature fall. Temperature goes up because you are pumping heat in and as you pump heat in temperature is proportional. Heat is proportional to temperature. The more heat you add the higher the temperature. It will reach an equilibrium where it evens out with fall.
Putting heat into something it will heat up then it will cool off by diffusion. Which is a one over square root decay of temperature. It is a much longer curve than the heating curve.
Now we've observed some behavior on these chips that was totally inexplicable until now. The fact that if you in fact put four instructions in the same word, as we can do, they wouldn't work. If we put them in four separate words they work just fine. What is different between running them in quick succession and running them delayed by forty nanoseconds? They get hot. And that says that in two nanoseconds the temperature did not decrease substantially so you ratchet up. You get a clock pulse and it goes up and then two nanoseconds later you get another one and it goes up again and after four of these steps you are twenty degrees above ambient and things are acting differently than they did when you were colder.
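(Editorial sketch, not part of the talk.) The ratcheting can be illustrated with a toy model. Everything numeric here is an assumption chosen only to make the shape visible: each clock pulse is taken to add 5 degrees (so that four pulses with no cooling would give the twenty degrees mentioned), and each contribution is taken to decay as one over the square root of elapsed time with a made-up time constant of 10 ns. Only the 2 ns and 40 ns spacings come from the talk.

```python
import math


def peak_rise(spacing_ns, pulses=4, step=5.0, tau_ns=10.0):
    """Temperature rise just after the last of `pulses` heat pulses.

    Each pulse adds `step` degrees, which then decays as
    1 / sqrt(1 + age / tau_ns). All parameters are illustrative."""
    last = (pulses - 1) * spacing_ns
    total = 0.0
    for i in range(pulses):
        age = last - i * spacing_ns  # time since pulse i fired
        total += step / math.sqrt(1.0 + age / tau_ns)
    return total


packed = peak_rise(2.0)    # four instructions in one word
spaced = peak_rise(40.0)   # one instruction per word
print(round(packed, 1), round(spaced, 1))
```

With pulses two nanoseconds apart almost none of the earlier heat has diffused away when the next pulse arrives, so the rise stays close to the no-cooling limit; at forty nanoseconds the earlier pulses have largely faded.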
This is a problem but it is also a feature that we should be able to exploit in some way. The only thing I have come up with is that it is very difficult to reverse engineer this chip if you don't have the proper temperature behavior in your tools.
(Question: Why doesn't this affect other commercial CPUs?) (Chuck) Because they have about a 500% margin. They wait a long time for these effects to dissipate before they do something. This is particularly dramatic for things like gate arrays. They are running so slowly, they are so conservative that they never encounter anything like this.
(Question: What do you do to fix it.) (Chuck) You make the transistors bigger. (laughing to himself) It's a sort of knee jerk reaction to anything, you can fix it by making the transistor bigger. The only thing that this tells me is which transistors and how much bigger. But not knowing that it was absolutely hopeless. I just had no idea.
(Question: How many transistors will you have to enlarge?) (Chuck) Twelve. Twelve out of sixteen thousand. One in a thousand. The critical ones that generate the pulses.
Another effect we observe, we have some chips that we can in fact put four instructions per word. Those chips just happen to have the right parameters and execute four instructions per word just fine. But the browser just doesn't work.
In the browser we can pack twenty to fifty percent of the instructions before it fails. Why? The answer is that the browser does a lot of in-page memory accesses and those are memory accesses every forty nanoseconds. The test code does not have the same relationship with the video coprocessor and might make accesses every one hundred and fifty nanoseconds. So this says if you execute so many packed instructions in so many nanoseconds the temperature still hasn't gotten down to where things work right and eventually it fails. If you do more off-page access it is slower and the temperature has more time to drop.
We're dissipating very little energy. The three hundred milliamps that was mentioned was for the board. The chip is dissipating maybe ten milliamps. It is very hard to measure because it is in the noise. It is very hard to simulate because my simulator isn't accurate for those measurements. One of our challenges is to measure how much energy the chip uses. It is almost miraculously low.
One of the few reasons for doing this, as I have said before, the world does not want another microprocessor. We are competing with much better staffed and financed operations than ours. It is a fool's game. Don't design your own microprocessor.
But we have a shot at it because we are very fast, we are very small, we are very low power and we are very cheap. If you are not all of those things you haven't a prayer. And since we are all of those things we have a tiny chance of getting anything into the market place.
(How small?) (Chuck) At .8u it is about 1 square mm. We can make it about sixteen times smaller in a different process. (questions about temp and other chips and spice etc)
(part 3 is Chuck's ideas about browsing the internet, not yet transcribed as of 11/29/98)
There is an older reference to Chuck's Color Forth at this site.