Naked Science Forum

Non Life Sciences => Geek Speak => Topic started by: i am bored on 21/01/2008 02:29:22

Title: 01001010100001010100010101.... a question about computer code
Post by: i am bored on 21/01/2008 02:29:22
How does it work? How does the computer translate what all those zeros, ones, dashes, and whatever else may be in there mean?
Post by: another_someone on 21/01/2008 03:02:42
A computer does not 'translate' anything - you, the human being, do the translation.

All the computer does is receive voltages (ones and zeroes, being the presence or absence of a voltage) based on switches that you close (every time you press a key on a keyboard you close a switch beneath the key, and this sends a pattern of ones and zeroes to the computer).

Based on the ones and zeroes it receives, it makes a myriad of decisions along the lines of "if this is 'one' and that is 'one' then the output should be 'zero'" (each decision is very simple and primitive, but there are so many of them that it begins to look complicated).

Then, when it has made all these decisions based on the ones and zeroes you sent via the keyboard, it will either print dots of colour on a piece of paper, or dots of light on a screen.  The computer has no idea what these dots mean; it just knows that, using the same kind of logic it has been applying so far, it must either draw a dot or not draw a dot in a specific location.

It is then a human being who looks at all those dots, and tries to make sense of whether it is a letter 'A' or a letter 'Z', and how the dots form into words, or anything else.

This is how computers have always worked. Each new generation of computers just has another layer of complexity in how it makes the decisions, so it looks to be more clever, but beneath its complexity, it is still a very stupid machine.
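
To make the "dots" idea concrete, here is a minimal Python sketch. The 5x7 pattern and the render function are made up for illustration; real screens and printers use far larger grids, but the principle of "draw a dot or do not" is the same.

```python
# A hypothetical 5x7 dot pattern; each string is one row, '1' means "draw a dot".
LETTER_A = [
    "01110",
    "10001",
    "10001",
    "11111",
    "10001",
    "10001",
    "10001",
]

def render(rows):
    """Turn rows of '0'/'1' characters into text made of dots and spaces."""
    return "\n".join(
        "".join("#" if bit == "1" else " " for bit in row) for row in rows
    )

print(render(LETTER_A))
```

The program only ever decides "dot or no dot" for each position; it is the person reading the output who sees a letter.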
Post by: i am bored on 21/01/2008 03:19:28
OK, but what about HTML code for website layouts and things of that nature? How does that work?
Post by: another_someone on 21/01/2008 04:45:41
Quote from: i am bored
OK, but what about HTML code for website layouts and things of that nature? How does that work?

It depends on what level you want to understand that.

At its most primitive level, it is no different to what I have described above, but at a higher level, we can see it from a different point of view, that makes it easier for us to understand.

The main difference with HTML is that the data is stored data rather than immediate data.  As I said before, when you press a key on a keyboard, you just send a stream of ones and zeroes. This time, though, the computer does not run through to the end of its logic and draw dots; half way through the logic it stores some of those ones and zeroes on a long term storage medium, such as a hard disk (the fact that the hard disk may be at the other end of a network is merely a nuance, it is still just storing it to long term storage of some kind).

Later on, you send some more ones and zeroes from the keyboard, but this time the logic of those ones and zeroes tells it to retrieve the information from the long term storage (which may be on the far end of a network), and by combining the ones and zeroes from storage with the recent ones and zeroes from your keyboard, it then makes further decisions, and then decides where to draw the dots on the screen or the paper.

Of course, from the human point of view, we don't really want to edit ones and zeroes all the time (that is exactly what one did have to do in the old days; there are still tools that allow you to do it, and I have done it myself many years ago), so people have designed tools that allow the ones and zeroes to be displayed and edited in a more human friendly way (such as displaying and editing alphabetic codes that look like HTML).

In the modern world, in order to make things easy for human beings, all computer programs have 3 phases involved in writing them.

The first phase is known as editing, for which you use an editor (a little like a word processor, but geared more towards making things easy for programmers rather than writers of books and memos).  What an editor does is simply take the ones and zeroes from the keyboard and store them directly onto a long term storage medium (typically a hard disk).  It also allows you to edit the files on the disk.  The contents of the files on the disk generally correspond closely to the binary data that came from the keyboard (which means that on most machines, if you type the letter 'A' on the keyboard, it will be stored as binary '01000001' and the letter 'G' will be '01000111').  This disk file is commonly known as the source code.
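
The two bit patterns quoted above are the standard ASCII codes for those letters, and can be checked with a few lines of Python. The function name key_to_bits is just an illustrative choice.

```python
def key_to_bits(char):
    """Return the 8-bit pattern for a single ASCII character."""
    return format(ord(char), "08b")

for key in "AG":
    print(key, key_to_bits(key))
```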

The next phase is known as compilation.  This takes the keystrokes that were stored in the disk file and tries to convert them to machine instructions that will cause the computer to perform the desired logic.

In older machines (and in rare circumstances in modern programming) the data in the source code will have a close correlation with the machine instructions for the machine (this is known as assembly language programming).  In such cases, you may have a line in the source file that looks like:


This would be translated to the binary pattern:


and this will tell the computer to add the number seven to an area of storage known as the AX register.

(note that different computers can have different binary machine instructions, and so will have different assemblers, but the above is what you would expect on a common PC)
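
As a rough illustration of the translation described above, the sketch below encodes the instruction "add seven to the AX register", conventionally written ADD AX, 7, as machine code. It assumes 16-bit x86 and the short accumulator form (opcode 0x05 followed by a 16-bit immediate, low byte first); a real assembler may choose a different but equally valid encoding, and the function name assemble_add_ax is made up for this example.

```python
def assemble_add_ax(value):
    """Encode 'ADD AX, value' for 16-bit x86 using opcode 0x05 (ADD AX, imm16)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("immediate must fit in 16 bits")
    # Opcode byte, then the immediate as two bytes, low byte first.
    return bytes([0x05]) + value.to_bytes(2, "little")

machine_code = assemble_add_ax(7)
print(" ".join(format(byte, "08b") for byte in machine_code))
```

The first byte says what to do (add to AX) and the remaining two bytes carry the number seven.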

In modern computing, it is very rare that one programs at that level, and more likely you will see code such as:


This one line can generate at least 5 machine level instructions of the type I showed above.  This is what is known as a high level language (as distinct from the assembly languages I talked about before).  Some high level languages can easily generate 20 or 30 instructions for one line of code - it depends on what you are trying to do, the nature of the language you are using, and how well the compiler is designed.

HTML is also a high level language, but of a different kind.  HTML will have lines such as:

<h1>This is a title</h1>

That one line tells the computer to draw dots on the screen that represent the text "This is a title", with the dots forming the font, and placed in the location, appropriate for a level one heading.  This requires lots of logical decisions that the computer has to make, possibly combining the information with code such as:

h1 { font-family: "Times New Roman"; font-size: 15pt; }

This means that text intended to be a level one heading should have dots drawn to display a Times Roman font at a size of 15 points.
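
A very rough sketch of the first of those decisions, in Python: read the markup, notice a level one heading, and look up which font to use for it. It assumes markup such as `<h1>This is a title</h1>` and a style rule of Times Roman at 15 points; the STYLES table and the HeadingCollector class are invented for illustration and bear no resemblance to the size of a real browser.

```python
from html.parser import HTMLParser

# Hypothetical style table standing in for a CSS rule for level one headings.
STYLES = {"h1": {"font": "Times Roman", "size_pt": 15}}

class HeadingCollector(HTMLParser):
    """Collect (tag, text, style) for every element that has a style rule."""

    def __init__(self):
        super().__init__()
        self.current = None
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag in STYLES:
            self.current = tag

    def handle_data(self, data):
        if self.current and data.strip():
            self.items.append((self.current, data.strip(), STYLES[self.current]))

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

def layout(markup):
    """Return what should be drawn, and in which font, for the given markup."""
    parser = HeadingCollector()
    parser.feed(markup)
    parser.close()
    return parser.items

print(layout("<h1>This is a title</h1>"))
```

Everything after this point (choosing which dots to light for each letter at that size) is where the bulk of the work lies.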

After the source file is compiled, the next phase is the execution of the machine code that is generated from the source file.

In many modern computer languages the compilation and execution phases are rolled into one (this is what happens with HTML, where the browser performs the task of both compiler and executor).  In other cases (such as when you run Microsoft Office), the compilation phase took place before the code was sent to you and the result was stored in a disk file; when you run the Office program, you load the machine code from the disk file, and the computer executes it without having to compile it.

Hope that is not all too much to take in.
Post by: JimBob on 21/01/2008 21:02:37
The translation from binary to usable information is accomplished by "machine code". This is a very simple code in which patterns of 0's and 1's (bits) stand for instructions that the processor can act on. Each processor or processor type has its own machine code (for example, all 64-bit Athlon processors use the same code), and it is built into the processor itself rather than the motherboard or the CMOS. Programs written in higher level languages, such as COBOL, are translated into this machine code, and so, ultimately, is whatever an HTML client (a browser programme) does.

Wiki can explain it better:
http://en.wikipedia.org/wiki/Machine_code

Instructions are patterns of bits with different patterns corresponding to different commands to the machine.

Every CPU model has its own machine code, or instruction set. Successor or derivative processor designs may completely include all the instructions of a predecessor and may add additional instructions. Some nearly completely compatible processor designs may have slightly different effects after similar instructions. Occasionally a successor processor design will discontinue or alter the meaning of a predecessor's instruction code, making migration of machine code between the two processors more difficult. Even if the same model of processor is used, two different systems may not run the same example of machine code if they differ in memory arrangement, operating system, or peripheral devices; the machine code has no embedded information about the configuration of the system.

A machine code instruction set may have all instructions of the same length, or may have variable-length instructions. How the patterns are organized depends largely on the specification of the machine code. Common to most is the division of one field (the opcode) which specifies the exact operation (for example "add"). Other fields may give the type of the operands, their location, or their value directly (operands contained in an instruction are called immediate). Some exotic instruction sets do not have an opcode field (such as Transport Triggered Architectures or the Forth virtual machine), only operand(s). Other instruction sets lack any operand fields, such as NOSCs.
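
To illustrate the opcode and operand fields described above, here is a small Python sketch for an entirely made-up 8-bit machine (not any real CPU): the upper four bits are assumed to be the opcode and the lower four bits an immediate operand. The OPCODES table and the decode function are hypothetical.

```python
# Hypothetical opcode table for a made-up 8-bit machine (not a real CPU).
OPCODES = {0b0001: "LOAD", 0b0010: "ADD", 0b0011: "STORE"}

def decode(instruction):
    """Split one 8-bit instruction into (operation name, immediate operand)."""
    opcode = (instruction >> 4) & 0b1111   # upper four bits
    operand = instruction & 0b1111         # lower four bits
    return OPCODES.get(opcode, "UNKNOWN"), operand

print(decode(0b00100111))
```

Real instruction sets use the same idea with more fields and, often, instructions of varying length.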

Also see the intro to computers link at the top of this Wiki page.

Post by: i am bored on 22/01/2008 02:14:35
ok i understand now
Post by: Nobody's Confidant on 22/01/2008 17:41:08
Very interesting
Post by: DoctorBeaver on 23/01/2008 08:06:43
Just a wee pedantic note - HTML is classed as interpreted, not compiled (although you can get HTML compilers) and, technically, it is code not a language (even though HTML stands for HyperText Markup Language). Machine code should be called machine language. Each individual instruction is called machine code. I can't remember the exact definition of a computer language rather than a code, but I believe it has something to do with code not necessarily being acted on in a strict, procedural way. You may have noticed that different browsers will display different parts of the web page first. This is called rendering and it's as if the browser is deciding which bits to show first. If computer languages were treated the same way, it would cause havoc.

HTML code still needs to be translated into machine code for the computer to execute, but it is done "on the fly" - a bit at a time - whereas with compilation, the whole of the source code needs to be compiled before the computer can act on it.

PHP is an example of an interpreted language. This website runs on PHP (which, originally, stood for Personal Home Page). It allows the user to interact with the website. Whereas HTML can only tell the computer what to display, PHP can accept input from the user, read from a database and perform operations on data. It is what is called a Server Side Include (SSI), or Pre-processor, meaning that all PHP operations are performed by the host computer before the data is sent to your PC over the net.

Javascript, on the other hand, works on your own PC even though the code originates on the server. This is known as a Client Side Include.
Post by: DoctorBeaver on 23/01/2008 08:29:25
But back to your original question.

Computers are based around what are called logic gates. Logic gates receive input and, depending on the input, will output something or not. There is a form of algebra called Boolean based on the same logic.

The operators are...

1) OR - output a voltage if either input A or input B has a voltage
2) XOR (eXclusive OR) - output a voltage if either input A or input B has a voltage BUT NOT BOTH.
3) AND - output a voltage if input A and input B both have a voltage.
4) NAND (Not AND) - output a voltage if there is NOT a voltage at both input A and input B. (this will output a voltage if A and B are zero).
5) NOR (Not OR) - output a voltage only if A and B are both zero.
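
The five operators listed above can be sketched in Python, using 1 for "voltage present" and 0 for "no voltage". The function names are illustrative only; in hardware these are circuits, not functions.

```python
def or_gate(a, b):
    return a | b

def xor_gate(a, b):
    return a ^ b

def and_gate(a, b):
    return a & b

def nand_gate(a, b):
    return 1 - (a & b)

def nor_gate(a, b):
    return 1 - (a | b)

GATES = {"OR": or_gate, "XOR": xor_gate, "AND": and_gate,
         "NAND": nand_gate, "NOR": nor_gate}

def truth_table(gate):
    """Return the gate's output for inputs (0,0), (0,1), (1,0), (1,1)."""
    return [gate(a, b) for a in (0, 1) for b in (0, 1)]

for name, gate in GATES.items():
    print(name, truth_table(gate))
```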

Examples of Boolean algebra...

4-bit binary code for the number 1 is 0001, and 3 is 0011. Therefore (1 AND 3) = 1 as both have the rightmost bit set. 2 is 0010 so (1 AND 2) = 0 as the same bit is not set in both.

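(1 OR 2) = 3 (0011), because a bit is set in the result wherever it is set in either input (not necessarily the same bit in both). Treated as a single true/false value, the result is 'true', because at least one bit is set.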
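
These examples can be tried directly in Python, whose & and | operators work bit by bit on whole numbers. Note that the bit-by-bit result of 1 OR 2 is 0011 (three); it only counts as a plain "one" when read as a single true/false value. The helper name bits is just for display.

```python
def bits(n):
    """Format a small non-negative integer as a 4-bit binary string."""
    return format(n, "04b")

print(bits(1 & 3))   # AND keeps only the bits set in both inputs
print(bits(1 & 2))
print(bits(1 | 2))   # OR keeps the bits set in either input
print(bool(1 | 2))   # the same result read as a single true/false value
```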
Post by: another_someone on 23/01/2008 16:10:35
Quote from: DoctorBeaver
Just a wee pedantic note - HTML is classed as interpreted, not compiled (although you can get HTML compilers) and, technically, it is code not a language (even though HTML stands for HyperText Markup Language).

Firstly, as you say later, it has to be translated to machine code to execute, thus in my book it is compiled (as all executable code must be compiled), but as I mentioned above, it is compiled at the stage of execution rather than as a separate phase.  This is what happens, at least to some degree, in all interpreted languages. In fact, the modern tendency is not to draw a clear distinction between compiled and interpreted languages, since very few interpreters (usually only batch language scripts) do not have some sort of internal object code, and many other languages (such as Java or many forms of Pascal or Forth) only ever get compiled to a machine code for a virtual machine - but even what is a virtual machine and what is a real machine gets complicated when one looks at the microcode executed in some processors.

This compilation phase may be more obvious in languages like Perl, which, although interpreted, have a clearly defined internal object code, and are compiled in their entirety when they are loaded (i.e. just in time compilation).  HTML probably is never compiled in its entirety (although even of this I am not convinced, since HTML has to resolve references to CSS, as well as working out the layout of each component depending on the size of other components, so the browser must create an overview of the entire document), and does not have a well defined virtual machine on which it runs (anyway, I thought I'd done enough without having to get into the differences between physical hardware and virtual machines).

As for it being a language, any structured information must be a language.  HTML is a structured language (it is not an algorithmic or procedural language, but then neither is SQL - which is a Structured Query Language) - they are declarative languages, but still languages.

Quote from: DoctorBeaver
Machine code should be called machine language. Each individual instruction is called machine code. I can't remember the exact definition of a computer language rather than a code, but I believe it has something to do with code not necessarily being acted on in a strict, procedural way. You may have noticed that different browsers will display different parts of the web page first. This is called rendering and it's as if the browser is deciding which bits to show first. If computer languages were treated the same way, it would cause havoc.

This is not so.  There is no guarantee that every compiler will generate the same machine code from the same source code.  Even relatively simple languages (such as C) may be subject to optimisation that will mean different machine code is generated, and certain aspects of the language standards are not always well defined and are open to interpretation by the compiler writer.  For higher level languages, the choices the compiler writer has to make are even greater, so the variability of output is greater.  As for functional languages (such as Scheme or Haskell), they are not even procedural languages, so would need to be translated into procedures in order to be executed on the underlying hardware, so the choices to be made are even greater.

Quote from: DoctorBeaver
HTML code still needs to be translated into machine code for the computer to execute, but it is done "on the fly" - a bit at a time - whereas with compilation, the whole of the source code needs to be compiled before the computer can act on it.

As I said above, given the interdependence between the various component parts of an HTML source file (the CSS, and the contextual nature of the layout rendering), I would think the browser would need to have a total overview of the HTML code, rather than simply translate it one bit at a time.  You may get away with translating shell scripts or batch languages in that way, but declarative languages need to have a good overview of what you are doing (not that I have ever worked on, or taken apart, a browser to know first hand how it does work - just thinking about how I might address some of the issues involved if I had to).

Quote from: DoctorBeaver
PHP is an example of an interpreted language. This website runs on PHP (which, originally, stood for Personal Home Page). It allows the user to interact with the website. Whereas HTML can only tell the computer what to display, PHP can accept input from the user, read from a database and perform operations on data. It is what is called a Server Side Include (SSI), or Pre-processor, meaning that all PHP operations are performed by the host computer before the data is sent to your PC over the net.

Again, not strictly correct.

PHP (in its usual guise, on a web server - before someone starts quoting standalone PHP to me) cannot interact with the user, because it does not function on the end user's machine.  PHP only interacts with the browser, and the server (e.g. server databases).

HTML does allow input, although (in the absence of JavaScript - which is nowadays somewhat rare) it cannot apply much intelligence to that input. It can render a 'submit' button that, if clicked, will tell the browser to send the contents of the form to the server, on which you will probably have something like PHP running to interpret the contents of the form.  It can also render a 'reset' button that causes the data to be reset.  All other user input is merely stored and forwarded (when the page is submitted) without any action being invoked on the client machine.

Nonetheless, the acceptance of user input itself is down to the browser (since that is all that is running on the user machine) as instructed by the HTML code it is executing.
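
As a small illustration of "store and forward", the Python sketch below shows roughly what a browser puts together when a form is submitted, and what a server-side script can recover from it. It assumes the default URL-encoded form format; the field names and values are made up.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields; the names and values are made up for illustration.
form_fields = {"username": "i am bored", "topic": "binary & html"}

# Roughly what a browser would send as the form contents on 'submit'.
body = urlencode(form_fields)
print(body)

# What a server-side script could recover from that text.
print(parse_qs(body))
```

Nothing is calculated on the way: the browser only packages what was typed, and any real processing is left to the program on the server.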
Post by: lyner on 23/01/2008 22:26:08
There is an essential difference between a compiled language and an interpreted language. A compiler produces a complete low level version of the whole programme, which is then executed by the computer. That file is binary and readable only by the machine. In an interpreted language, the programme is presented to the computer in its text form. The computer interprets (decodes) each line of the written programme every time it comes to it during execution.
Compiling a programme usually makes it run many times faster, because the original text does not need to be 'parsed' constantly and the code can be optimised to eliminate wasted steps.
Basic was (is) an interpreted language, but modern Basics can be compiled and they will run faster.
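
The cost of re-reading the text every time can be seen in a small Python experiment. This is only an analogy: Python's compile() produces bytecode for its own virtual machine rather than true machine code, and the expression and function names are made up. One function parses the text on every pass, the other parses it once and reuses the result.

```python
import timeit

# A made-up expression standing in for "a line of programme text".
SOURCE = "(x * 3 + 1) % 7"

def run_interpreted(n):
    """Re-parse the source text on every pass, as a naive interpreter would."""
    return [eval(SOURCE, {"x": x}) for x in range(n)]

def run_precompiled(n):
    """Parse once up front, then execute the stored result on every pass."""
    code = compile(SOURCE, "<example>", "eval")
    return [eval(code, {"x": x}) for x in range(n)]

# Both approaches must give the same answers; only the time taken differs.
assert run_interpreted(100) == run_precompiled(100)
print("re-parsing each time:", timeit.timeit(lambda: run_interpreted(2000), number=5))
print("parsed once:         ", timeit.timeit(lambda: run_precompiled(2000), number=5))
```

The exact timings depend on the machine, so none are quoted here.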
Post by: DoctorBeaver on 25/01/2008 08:14:20
Quote from: DoctorBeaver
Just a wee pedantic note - HTML is classed as interpreted, not compiled (although you can get HTML compilers) and, technically, it is code not a language (even though HTML stands for HyperText Markup Language).

Quote from: another_someone
Firstly, as you say later, it has to be translated to machine code to execute, thus in my book it is compiled (as all executable code must be compiled), but as I mentioned above, it is compiled at the stage of execution rather than as a separate phase.  This is what happens, at least to some degree, in all interpreted languages. In fact, the modern tendency is not to draw a clear distinction between compiled and interpreted languages, since very few interpreters (usually only batch language scripts) do not have some sort of internal object code, and many other languages (such as Java or many forms of Pascal or Forth) only ever get compiled to a machine code for a virtual machine - but even what is a virtual machine and what is a real machine gets complicated when one looks at the microcode executed in some processors. This compilation phase may be more obvious in languages like Perl, which, although interpreted, have a clearly defined internal object code, and are compiled in their entirety when they are loaded (i.e. just in time compilation).  HTML probably is never compiled in its entirety (although even of this I am not convinced, since HTML has to resolve references to CSS, as well as working out the layout of each component depending on the size of other components, so the browser must create an overview of the entire document), and does not have a well defined virtual machine on which it runs (anyway, I thought I'd done enough without having to get into the differences between physical hardware and virtual machines).

The distinction between interpreted and compiled is that an interpreter acts directly from source code. With compiled languages, it is the compiled version that is presented to the computer. Yes, technically the computer does compile an interpreted language at runtime, but the source code doesn't go through a separate, external process.

Quote from: another_someone
As for it being a language, any structured information must be a language.  HTML is a structured language (it is not an algorithmic or procedural language, but then neither is SQL - which is a Structured Query Language) - they are declarative languages, but still languages.

Quote from: DoctorBeaver
Machine code should be called machine language. Each individual instruction is called machine code. I can't remember the exact definition of a computer language rather than a code, but I believe it has something to do with code not necessarily being acted on in a strict, procedural way. You may have noticed that different browsers will display different parts of the web page first. This is called rendering and it's as if the browser is deciding which bits to show first. If computer languages were treated the same way, it would cause havoc.

Quote from: another_someone
This is not so.  There is no guarantee that every compiler will generate the same machine code from the same source code.  Even relatively simple languages (such as C) may be subject to optimisation that will mean different machine code is generated, and certain aspects of the language standards are not always well defined and are open to interpretation by the compiler writer.  For higher level languages, the choices the compiler writer has to make are even greater, so the variability of output is greater.  As for functional languages (such as Scheme or Haskell), they are not even procedural languages, so would need to be translated into procedures in order to be executed on the underlying hardware, so the choices to be made are even greater.

It was many years ago that I did my computer science degree & I can't remember the exact definition of a computer language. What I do remember, though, is referring to the machine language of an IBM 360 as code & being berated.

Quote from: DoctorBeaver
HTML code still needs to be translated into machine code for the computer to execute, but it is done "on the fly" - a bit at a time - whereas with compilation, the whole of the source code needs to be compiled before the computer can act on it.

Quote from: another_someone
As I said above, given the interdependence between the various component parts of an HTML source file (the CSS, and the contextual nature of the layout rendering), I would think the browser would need to have a total overview of the HTML code, rather than simply translate it one bit at a time.  You may get away with translating shell scripts or batch languages in that way, but declarative languages need to have a good overview of what you are doing (not that I have ever worked on, or taken apart, a browser to know first hand how it does work - just thinking about how I might address some of the issues involved if I had to).

I was referring to the order in which browsers display the components of the page. Not all display the components in the same order. If, say, a banking system did the same we would be in serious trouble.

Quote from: DoctorBeaver
PHP is an example of an interpreted language. This website runs on PHP (which, originally, stood for Personal Home Page). It allows the user to interact with the website. Whereas HTML can only tell the computer what to display, PHP can accept input from the user, read from a database and perform operations on data. It is what is called a Server Side Include (SSI), or Pre-processor, meaning that all PHP operations are performed by the host computer before the data is sent to your PC over the net.

Quote from: another_someone
Again, not strictly correct.

PHP (in its usual guise, on a web server - before someone starts quoting standalone PHP to me) cannot interact with the user, because it does not function on the end user's machine.  PHP only interacts with the browser, and the server (e.g. server databases).

That's exactly what I said. PHP does its bit & then the result is sent to the end user.


Quote from: another_someone
HTML does allow input, although (in the absence of JavaScript - which is nowadays somewhat rare) it cannot apply much intelligence to that input. It can render a 'submit' button that, if clicked, will tell the browser to send the contents of the form to the server, on which you will probably have something like PHP running to interpret the contents of the form.  It can also render a 'reset' button that causes the data to be reset.  All other user input is merely stored and forwarded (when the page is submitted) without any action being invoked on the client machine.

Nonetheless, the acceptance of user input itself is down to the browser (since that is all that is running on the user machine) as instructed by the HTML code it is executing.


Admittedly, HTML can accept form input, but it can't do anything with it. There are no facilities for acting upon input data so all HTML can do is pass the input to a language that can deal with it.

Post by: another_someone on 25/01/2008 09:18:39
Quote from: DoctorBeaver
Yes, technically the computer does compile an interpreted language at runtime, but the source code doesn't go through a separate, external process.

That is what I said earlier.

Quote from: another_someone
In many modern computer languages the compilation and execution phases are rolled into one (this is what happens with HTML, where the browser performs the task of both compiler and executor).  In other cases (such as when you run Microsoft Office), the compilation phase took place before the code was sent to you and the result was stored in a disk file; when you run the Office program, you load the machine code from the disk file, and the computer executes it without having to compile it.

Yes, I could have mentioned that where the compile and execution is rolled into one step, it is regarded as an interpretive language, but I thought I'd thrown up enough jargon without adding more.

Quote from: DoctorBeaver
It was many years ago that I did my computer science degree & I can't remember the exact definition of a computer language. What I do remember, though, is referring to the machine language of an IBM 360 as code & being berated.

I think the difference between 'code' and a 'language', is that 'code' refers to a specific set of instructions, while 'language' refers to the rules by which that 'code' is understood.  Thus you would refer to a program as a piece of code, but it is written according to a set of rules that defines the language.

Since we rarely write 'machine code' (in fact, arguably, we cannot write machine code, we can only load machine code directly into memory), we usually refer to that which we do write, which is 'assembly code', written in an 'assembly language'.

Quote from: DoctorBeaver
I was referring to the order in which browsers display the components of the page. Not all display the components in the same order. If, say, a banking system did the same we would be in serious trouble.

Not sure what you mean by this?

There are other languages that are not procedural, and banks use them extensively (SQL being one that comes to mind).  The only thing that matters is that for a given piece of code there is a predictable and repeatable outcome.  If this is achieved, in which order it happens does not matter (and even in many procedural languages, you cannot guarantee the order in which instructions are executed, since optimisers are free to reorder instructions if the optimiser can guarantee the same outcome but in a more optimal fashion).

Of course, from the user's perspective, things must always happen in the same order - but that is the case even as far as browsers are concerned, insofar as the order in which the browser may render the page does not affect the key issues that relate to interaction between the user and the system.

Quote from: DoctorBeaver
That's exactly what I said. PHP does its bit & then the result is sent to the end user.

No, you are missing the point entirely - all PHP does is send the output to the client's machine, but it has no way of knowing if the output is displayed to the user or not.

This is not merely being picky, because PHP can send any data, not merely HTML.  I often use PHP to send images and Javascript, and increasingly Javascript running on the browser can intelligently interact with web pages (more commonly XML rather than HTML) generated on the server (such as a PHP script sending data to the Javascript to modify the Javascript's behaviour).


Quote from: DoctorBeaver
Admittedly, HTML can accept form input, but it can't do anything with it. There are no facilities for acting upon input data so all HTML can do is pass the input to a language that can deal with it.

As I said before, to a large extent this is true, but not totally so.

You have two buttons that may be rendered in HTML, the 'Submit' and the 'reset' button.  If these buttons are not intercepted by Javascript, then they will cause the form either to be sent to the server or to be reset to its original values.  You can even have multiple forms on the page, and depending on which 'Submit' button you press, you can send data from different forms.

There are also various other bits of simple intelligence that HTML has (such as managing radio buttons and drop down selections).

If one includes CSS, there is even more intelligence in the way the interface is managed.

None of this is data processing in any real sense, it is merely managing the user interface, but it is not totally dumb either.

The whole point about the Internet is the separation of user interface intelligence from data processing intelligence – hence why you need two separate languages (actually, somewhat more than that, but nonetheless two domains where different languages are used) that are specialised at doing their different tasks.
Post by: Nobody's Confidant on 25/01/2008 17:34:41
    I finally know how those verification messages work, you know, the ones with the messed up letters? Computers only read presence and absence of dots, they don't know what they mean. The lines going through the letters mess up how the computer looks at it. Brilliant.
Post by: DoctorBeaver on 26/01/2008 20:32:28

Quote
Quote
Quote from: DoctorBeaver on 25/01/2008 07:14:20
I was referring to the order in which browsers display the components of the page. Not all display the components in the same order. If, say, a banking system did the same we would be in serious trouble.

Not sure what you mean by this?

What I mean is that different browsers may display the components of a web page in a different order. For instance, browser A may display all the graphics first, browser B may display components from top to bottom regardless of their being text or graphic.

Imagine writing a banking system. Part of that system may need to calculate your interest, add it and then deduct tax. You then move the system from a Unix platform to Windows and find that now it is calculating your interest, deducting tax from the balance and THEN adding the interest. You'd be pleased, but the Chancellor would go mental!

Or what if Google maps printed out the directions in a different order on a Mac?
Post by: another_someone on 26/01/2008 22:35:26
But you are missing what I was saying.

Suppose system A adds 5% interest and then deducts 40% tax from the total, while system B first deducts 42% tax and then adds 8.6207% interest tax free.  The point is, it does not matter in which sequence the calculations are made, so long as the outcome of the calculations is the same.  It is not the order of what is done that matters, but what is the outcome of what is done.  For someone with £100 in the bank, the chancellor wants £42, and the account holder expects to have £63 left in his account - how you do it does not matter, so long as the outcome is the same.  Yes, the exact numbers you need to deal with change according to the order in which you do things, but the main issue is that your target is the end result, so it matters not what order a browser does things in, so long as it ensures the final outcome is that which is desired.
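
Those figures can be checked with a small sketch. This is only an illustration: the function names `system_a` and `system_b` are made up here, and exact fractions are used (an assumption, not something a real banking system is claimed to do) so that rounding does not cloud the comparison.

```python
from fractions import Fraction

def system_a(balance):
    """Add 5% interest, then deduct 40% tax from the total."""
    total = balance * Fraction(105, 100)
    tax = total * Fraction(40, 100)
    return tax, total - tax

def system_b(balance):
    """Deduct 42% tax first, then add interest on what is left."""
    tax = balance * Fraction(42, 100)
    remainder = balance - tax
    # 5/58 is roughly 8.6207%: the rate that brings the remainder
    # up to the same final balance as system A.
    interest_rate = Fraction(5, 58)
    return tax, remainder * (1 + interest_rate)

# Both return (tax, final balance) for a starting balance of 100.
print(system_a(Fraction(100)) == system_b(Fraction(100)))
```

For a starting balance of 100, both functions give a tax of 42 and a final balance of 63, even though the steps run in opposite orders.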
Post by: DoctorBeaver on 27/01/2008 00:01:22
That's not what I meant. Maybe I didn't make myself clear. I was trying to demonstrate how you could get 2 different answers by doing the same calculations but in a different order.

In my first example, interest is added and then the tax is worked out on the total. In the 2nd example, the interest is calculated (it would be the same) but the tax is calculated BEFORE the interest is added.

To put figures to it...
Balance = 100, interest = 10%, Tax = 10%

Ex 1. 100 + 10(interest) = 110 * 10%(tax) = 11, so the chancellor gets £11 and you end up with £99.

Ex 2. 100 * 10%(tax) = £10 tax, so the chancellor gets £10. THEN the original interest of £10 is added. You end up with £100 instead of £99, and the chancellor gets £10 instead of £11.
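
The two examples can be written out as a rough sketch. The function names are invented for illustration, and whole-pound integer arithmetic is assumed purely to keep the numbers tidy.

```python
def interest_then_tax(balance, rate=10):
    """Ex 1: add interest, then tax the new total (rate in percent)."""
    balance += balance * rate // 100
    tax = balance * rate // 100
    return tax, balance - tax

def tax_on_old_balance(balance, rate=10):
    """Ex 2: work out the tax before the interest has been added."""
    interest = balance * rate // 100
    tax = balance * rate // 100
    return tax, balance + interest - tax

# Each call returns (tax paid, final balance).
print(interest_then_tax(100))   # (11, 99)
print(tax_on_old_balance(100))  # (10, 100)
```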

As HTML is only displaying things and not doing any calculations, it matters not what order it does things in; you always get the whole page displayed and it looks the same no matter what browser or platform you use (OK, IE may not render exactly the same as FF or Opera, but the differences are not critical). But a program that does calculations has to do them in the same order no matter what platform it's running on.
Post by: another_someone on 27/01/2008 01:38:38
But that is not quite true of HTML either.

It is certainly true that the order you draw things in does not matter, but the actual rendering of the image is not where most of the work is.  What very much does matter is the laying out of the image, and it is important that you lay out the image in the proper order (i.e. you cannot know where one component goes until you have worked out the sizes of the components above and to the left of it, and in some cases also those to the right, below, or of its parent component).  There are also rules about inheritance of characteristics (this is more critical when you deal with CSS).
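
As a very simplified sketch of that dependency (real browser layout is far more involved, and the function name and pixel heights here are made up), consider boxes stacked one above the other:

```python
def stack_vertically(heights):
    """Return the top (y) coordinate of each box in a simple vertical stack.

    Each box starts where the previous one ends, so box n cannot be
    placed until the heights of boxes 0..n-1 are known.
    """
    tops = []
    y = 0
    for h in heights:
        tops.append(y)
        y += h
    return tops

# Hypothetical page: banner, smiley bar, text box (heights in pixels).
print(stack_vertically([80, 24, 200]))  # [0, 80, 104]
```

Once the positions are known, the boxes could be painted in any order; it is working out the positions that has to respect the dependencies.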

Of course, there are subtle differences between the way IE and Opera and Mozilla deal with certain layout issues, and these are the bane of any web designer's life, but this is no different from trying to port code from a machine with native 16 bit architecture to 32 bit architecture, or moving from high byte integer ordering to low byte integer ordering, or moving from a machine that has cr/lf line termination to just lf line termination.  You code around such issues, and make sure that where your program is dependent upon one way of doing things or another, it becomes aware of the context it is running in and changes its behaviour to allow for it.  Just moving from one dialect of SQL to another (e.g. moving from MySQL to DB2 to Oracle) can cause subtle headaches.
Post by: DoctorBeaver on 27/01/2008 09:02:44
George, I fully understand what you're saying; but my point is that in some circumstances it is crucial that everything is done in the expected order. Yes, HTML has to work out where to put what, but it doesn't always do that first. I've seen many sites where the positioning of an element changes as more of the page appears. That is particularly true of sites that use relative, rather than absolute, positioning.

Would it really matter that much if, when you want to post a reply here, a particular browser displayed the smileys above the text box before it displayed the banner? No, it wouldn't. But in my interest & tax example, it does matter if tax is deducted before the interest is added.
Post by: another_someone on 27/01/2008 12:18:02
The point I was making is that there is no difference in the notion of correctness, and what it means, whether you are writing a browser or writing a banking system.

If you are trying to say that the consequences of a slight degree of incorrectness in a browser are less severe than a slight degree of incorrectness in a flight management system controlling an airliner with the lives of 300 people at stake, and the cruciality of correctness for a banking system lies somewhere in-between, then I would agree with that totally.

The order in which things happen, in any of the cases, is not the issue - what is the issue is that the final outcome must be as if the specified order was adhered to.

Bear in mind also that the browser is not only a display agent, but it is an interface between the user display and the network, and the network does require serialisation of data, so in that context doing things in the right order is even more critical (although failure to do so is neither life threatening nor financially ruinous - it merely causes you to fail to access the network with the browser), because the server at the other end expects data (particularly protocol data) in a particular order.
Post by: DoctorBeaver on 27/01/2008 14:02:06
Quote
The order in which things happen, in any of the cases, is not the issue - what is the issue is that the final outcome must be as if the specified order was adhered to.

In the example I gave, the order that things happen is precisely the issue as the outcome is affected. If a browser displays component B before component A, the final outcome is not affected. I could write a browser that displayed a page starting from the bottom, but when the whole page was displayed, it would look identical to if the browser had started displaying from the top, the left, the right, or the middle outwards.

But, in my example, if tax is deducted before interest is added then the outcome is different. Calculations must be performed in the order:-

calculate interest on current balance
add interest to balance
calculate tax on current balance
deduct tax from balance

Any other order gives a wrong outcome.


Post by: another_someone on 27/01/2008 22:19:16
But that is exactly why I gave the counter example.

You can reverse the order of the calculation.  You have to make adjustments to the numbers in order to reverse the order (i.e. raise the tax and the interest rates) if you are going to reverse the order, but you can still make the outcome the same.

Yes, if you are going to say you must first add X amount of interest, and then deduct Y amount of tax; then of course reversing the order while retaining the same numbers will lead to a wrong outcome; but the generality of adding interest first and then deducting tax is not sacrosanct, it is only that the numbers have to be modified to do it in another order.

As for the display of the HTML elements - the physical display of the pixels is not the major issue, as I have said before.  Of course, the actual display of the pixels can be in any order, it can be from the middle out, it can be every second pixel (some graphics images are actually encoded deliberately so they will display every second line before filling in the ones in the middle, so that the display of the image does not appear from top to bottom but appears as a rough image with the details slowly filled in, which some people think looks nicer for the end user).  The difficult task for the browser, as I said earlier, is not the display of the elements, but the laying out of the elements on the virtual screen before anything is displayed.  The order of the layout is not visible to the user, since nothing is shown until the browser knows where to show everything, but there are interdependencies within the layout that must be adhered to.
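
The "every second line" idea can be sketched as a two-pass row order. This is a deliberate simplification with a made-up function name; real interlaced image formats use more passes than this.

```python
def interlaced_row_order(height):
    """Rows in the order a simple two-pass interlaced image would be drawn."""
    even_rows = list(range(0, height, 2))  # first pass: rough image
    odd_rows = list(range(1, height, 2))   # second pass: fill in the gaps
    return even_rows + odd_rows

print(interlaced_row_order(6))  # [0, 2, 4, 1, 3, 5]
```

Whatever order the rows arrive in, the finished image is the same, which is the sense in which the drawing order does not matter.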

In processing terms, the actual amount of processing that goes on in making sure all of the elements sent to a browser will fit correctly, that the table element heights are correctly positioned, that elements that are meant to be centred within other elements are indeed centred within those parent elements, while elements that are left aligned are done so correctly, while long lines are folded where they are allowed to and not folded where they are not supposed to, that elements that are meant to be drawn behind other elements are not visible in those parts they are meant to be obscured; all of this far exceeds the processing complexity of adding some interest and taking away some taxation.  Of course, in the world of financial systems, where you start working in complex derivative products, there can be some far more complex mathematical calculations one has to do, but this is not the kind of accountancy product you are referring to.
Post by: DoctorBeaver on 27/01/2008 22:57:47
Quote
Yes, if you are going to say you must first add X amount of interest, and then deduct Y amount of tax; then of course reversing the order while retaining the same numbers will lead to a wrong outcome; but the generality of adding interest first and then deducting tax is not sacrosanct, it is only that the numbers have to be modified to do it in another order.

George, I fully appreciate the point you're making, but I think you are still misunderstanding me.

If you write code something along the lines of...

$interest = $balance * 0.1;
$balance += $interest;
$tax = $balance * 0.1;
$balance -= $tax;

you would expect the code to execute in that order no matter what platform (or even which language) you chose.

If you then ported it to a different platform, you would not expect it to suddenly run as...

$interest = $balance * 0.1;
$tax = $balance * 0.1;
$balance += $interest;
$balance -= $tax;

Not only would you not expect it, it would cause havoc if that happened.
Post by: another_someone on 28/01/2008 00:02:44
What I would actually expect any decent compiler to do with that is:

Code: [Select]
$interest = $balance * 0.1
$a = $balance + $interest
$tax = $a * 0.1
$balance = $a - $tax

The $a value would probably never actually be stored in memory, but simply remain in a register.  Even $interest would not be stored until after it has been added to $balance (which would still be in a register).

Of course, the same thing could also be written (and I would have no problem if a compiler chose to do it that way, although I would think it the less likely solution):

Code: [Select]
$interest = $balance * 0.1
$a = $balance + $interest
$balance = $a * 0.9
$tax = $a - $balance


Of course, it is even conceivable that a compiler might rewrite that as:

Code: [Select]
$i = 0.1
$interest = $balance * $i
$a = $balance * 1.1
$balance = $a * 0.9
$tax = $a * $i

The advantage of doing things this way is that the dependency between calculating tax and the final balance is broken, as is the interdependency between calculating the value to be stored in $interest and the value to be retained in a register as $a.   In breaking these dependencies, it allows the processor, if it is capable of doing so, to perform these calculations in parallel, or at least overlap the calculations.

$i, like $a would be a register variable, so would never be written to memory.
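
As a rough check that the rewritten form gives the same results as the original four statements, here is a sketch with invented function names. Exact fractions are used as an assumption: with ordinary binary floating point, 0.1 is not exactly representable, and the two forms could differ in the last few bits.

```python
from fractions import Fraction

TENTH = Fraction(1, 10)

def original(balance):
    """The four statements in the order they were written."""
    interest = balance * TENTH
    balance += interest
    tax = balance * TENTH
    balance -= tax
    return interest, tax, balance

def rewritten(balance):
    """Mirrors the last rewrite above; i and a stand in for registers."""
    i = TENTH
    interest = balance * i
    a = balance * Fraction(11, 10)
    new_balance = a * Fraction(9, 10)
    tax = a * i
    return interest, tax, new_balance

print(original(Fraction(100)) == rewritten(Fraction(100)))  # True
```

Both return an interest of 10, a tax of 11, and a final balance of 99 for a starting balance of 100.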
Post by: DoctorBeaver on 28/01/2008 08:18:47
Whichever way it is done, the calculations will still have to be done in the correct order or you will get a wrong answer.
Post by: another_someone on 28/01/2008 13:11:29
The 'correct' order, yes - but in the last example, some of the instructions could be reversed and still be correct, because some of the dependencies have been broken by the way it was rewritten.

Looking at your original code:

Code: [Select]
1) $interest = $balance * 0.1;
2) $balance += $interest;
3) $tax = $balance * 0.1;
4) $balance -= $tax;

If it were converted to assembly level code, assuming an abstract machine with at least 5 orthogonal registers and 3 operand instructions (granted, it is more common for machines to have two operand instructions, but the distinction is not significant for the example to be relevant), the compiler could code your example as:

Code: [Select]
mov r1, $balance   ; [stmt 1 & 2] r1 = balance       
mov r2, 0.1        ; [stmt 1 & 3] r2 = 0.1           
mul r1, r2, r0     ; [stmt 1]     r0 = balance * 0.1
sto r0, $interest  ; [stmt 1]     interest = r0       
add r0, r1, r0     ; [stmt 2]     r0 ($a) = r0 (interest) + r1 (balance)
mul r0, r2, r3     ; [stmt 3]     r3 (tax) = r0 ($a) * r2 (0.1)
sto r3, $tax       ; [stmt 3]     tax = r3                     
sub r0, r3, r3     ; [stmt 4]     r3 (balance) = r0 ($a) - r3 (tax)
sto r3, $balance   ; [stmt 4]     $balance = r3

Which is fairly true to your original code, but it could also code it as:

Code: [Select]
mov r2, 0.1           ; [stmt 1, 3] r2 = 0.1
mov r3, 1.0           ; [stmt 2, 4] r3 = 1.0
mov r1, $balance      ; [stmt 1, 2] r1 = balance
add r3, r2, r4        ; [stmt 2]    r4 = r3 (1.0) + r2 (0.1) = 1.1
mul r1, r2, r0        ; [stmt 1]    r0 (interest) = r1 (balance) * r2 (0.1)
sub r3, r2, r5        ; [stmt 4]    r5 = r3 (1.0) - r2 (0.1) = 0.9
sto r0, $interest     ; [stmt 1]    interest = r0
mul r1, r4, r1        ; [stmt 2]    r1 ($a) = r1 (balance) * r4 (1.1)
mul r1, r5, r0        ; [stmt 4]    r0 (balance) = r1 ($a) * r5 (0.9)
mul r1, r2, r1        ; [stmt 3]    r1 ($tax) = r1 ($a) * r2 (0.1)
sto r0, $balance      ; [stmt 4]    $balance = r0
sto r1, $tax          ; [stmt 3]    $tax = r1

Which interleaves all of the instructions.  This interleaving is not merely arbitrary, since many processors can overlap operations where they see that the result of the previous operation is not required as an input to the next operation.

What you are saying is that any instruction ordering has to honour dependencies in their execution - which I don't disagree with.  All I am saying is that this has nothing to do with whether it is a banking system or not.  All applications (whether it be a browser trying to work out how to display HTML, or a banking system) have some dependencies, while other bits of code are not interdependent and can safely be reordered by the compiler (and neither the compiler nor the processor is under any obligation to honour your suggested order if it finds it does not violate any interdependencies when the instructions are reordered).
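
The kind of check involved can be sketched as follows. This is a toy model, not how any particular compiler does it: each statement is reduced to the set of variables it reads and the set it writes, and `can_swap` is a made-up helper name.

```python
def can_swap(first, second):
    """True if two statements can be reordered without changing the result.

    Each statement is a (reads, writes) pair of sets of variable names.
    A swap is unsafe if either statement writes something the other
    reads or writes.
    """
    reads1, writes1 = first
    reads2, writes2 = second
    return not (writes1 & (reads2 | writes2) or writes2 & reads1)

# The four original statements, as (reads, writes).
stmt1 = ({"balance"}, {"interest"})             # $interest = $balance * 0.1
stmt2 = ({"balance", "interest"}, {"balance"})  # $balance += $interest
stmt3 = ({"balance"}, {"tax"})                  # $tax = $balance * 0.1
stmt4 = ({"balance", "tax"}, {"balance"})       # $balance -= $tax

# Two statements from the rewritten form; both only read $a.
new_balance = ({"a"}, {"balance"})              # $balance = $a * 0.9
new_tax = ({"a", "i"}, {"tax"})                 # $tax = $a * $i

print(can_swap(stmt2, stmt3))          # False
print(can_swap(new_balance, new_tax))  # True
```

In the original form every adjacent pair of statements is tied together through $balance, $interest or $tax, so nothing can move; in the rewritten form the last two statements are independent of each other.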
Post by: lyner on 28/01/2008 17:09:08
Some operations are commutative and some operations are not.
Isn't it as simple as that?
If what you are doing can be written down as a simple set of algebraic expressions and these can be re-written / re-arranged, keeping to the rules, then you will get the same answer. If the operations don't follow the rules then you won't.
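
A small sketch of the difference, with helper names invented for the purpose and exact fractions assumed to avoid rounding noise: two percentage changes commute with each other, but a percentage change and a fixed addition do not.

```python
from fractions import Fraction

def add_percent(p):
    """Return a function that changes a value by p percent."""
    return lambda x: x * (1 + Fraction(p, 100))

def add_fixed(amount):
    """Return a function that adds a fixed amount."""
    return lambda x: x + amount

up10, down10, plus5 = add_percent(10), add_percent(-10), add_fixed(5)
x = Fraction(100)

# Two scalings commute: the order does not matter.
print(down10(up10(x)) == up10(down10(x)))  # True
# A scaling and a fixed addition do not commute.
print(plus5(up10(x)) == up10(plus5(x)))    # False
```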
Any compiler worth its salt will look at the code you have written and re-write it, on your behalf, to make it go as fast as possible - changing the order of things and condensing two or more operations into one, where it can.  Try writing your own benchmark programs and you may find that it takes no longer to do an operation 100 times than it does to do it once; the compiler spotted the redundancy and only ran the routine once.
No interpreter can do that because it only looks at one line at a time.