54 reading and writing binary data python cookbook pdf


Before you move on, it's worth verifying that Python is installed and working correctly. You can use the shortcuts that were added to the Windows Start Menu if you wish, but I recommend that you launch Python from the command prompt as this is how you will run the scripts you create later in this tutorial.

At the prompt, enter the python command. If you see a message such as "'python' is not recognized as an internal or external command, operable program or batch file", the Python directory was not placed on your Windows Path. See Related topics for information on how to set this up.

When you quit the Python prompt, you should return to the Windows command prompt.

Navigate to the setuptools package page (see Related topics) and find the file for your version of Python, which is 2. When it is finished, you will be returned to the Windows command prompt, as in Figure 3.

Open a Windows command prompt window and issue the python command to launch the Python interpreter. At the prompt, enter the commands from Listing 2 to connect to DB2 and count the number of rows in the country table. Be sure to replace the credentials in the code in Listing 2 with your actual DB2 credentials. After you enter the final line, press Enter and the code will execute. You should see a result that begins with "Count:". If you do not, verify that your credentials for connecting to DB2 are correct.

With the database set up and Python ready to get to work, you are now ready to start developing the main subject of this tutorial.

In the next section, you will download, parse, and convert CSV data from the U.S. Census Bureau. You will then learn how to read this data from the database and display it to the user. Before you start, create a folder somewhere on your hard disk where you will store the project files.

I stored my data in a folder on the C: drive.

The United States Census Bureau has a plethora of data available for download, in a variety of different formats. Unfortunately, the population data from Census and estimates for each year since then is only available in CSV format and not XML.

Instead, however, you will create a Python script to do this task. In your favorite text editor, create a new file and save it as download. Add the code from Listing 3 to this file. In this script, you use the httplib module to connect to the census. Then you fetch the response and write it to a file named data. To run this script, open the Windows Command Prompt, change to the project directory, and run the Python script. When the script has completed, you will return to the prompt.
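The download step can be sketched roughly as follows. This is a minimal sketch in Python 3 syntax, not the tutorial's Listing 3: it uses urllib.request in place of the httplib module named above, and because the actual URL and file names are not given here, they are left as parameters (the names download, url, and dest_path are hypothetical).

```python
import shutil
import urllib.request


def download(url, dest_path):
    """Fetch `url` and write the raw response bytes to `dest_path`.

    Both arguments are placeholders: the real census URL and the
    output file name are not reproduced in this text.
    """
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        # Stream the body to disk instead of holding it all in memory.
        shutil.copyfileobj(response, out)
    return dest_path
```

As in the original script, a successful run produces no output; an error surfaces as an exception.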

You might wonder why there were no messages produced—don't worry, this is a good thing as it means no errors occurred. Open your project folder in Windows Explorer and you will now notice an extra file in the folder named data. If Excel asks you to save the file, choose No. If you accidentally save the file, simply delete it and re-run the download.

To convert the CSV data into XML, you must first be clear on how exactly you wish to store the data, whether different records should be stored differently, and whether some records should be discarded. In the CSV file you just downloaded, you will notice three types of data: country, region, and state records. The first row of the file is a header row that is to be used for column names.

The script you create in this section will take the header row and use this data to form the tag names for each element that a record should have in the XML document. The script will determine, based on the first four columns, whether the particular row refers to a country, region, or state, and will set the tag name accordingly to indicate what the XML document refers to.

Finally, the script will choose to exclude the Puerto Rico Commonwealth record as it has some incomplete data. In your text editor, create a new file and save it as convert.

Add the code from Listing 4 to this file. In this file, you use the csv library to read the data. Then you loop through each line of the CSV file. If the current line is the first line of the file, you set that record as the header. This will be used later in the script as the element name for each field in a country, region, or state record. If the current line is not the header record, you loop through each column in the record and create an inner XML element string whose name is derived from the header record.
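The loop described above, together with the exclusion check described next, might look something like this. It is a sketch in Python 3 syntax rather than the tutorial's Listing 4. The sample rows, the column names, the rule in record_kind, and the choice of which column holds the X are all assumptions made for illustration.

```python
import csv
import io
from xml.sax.saxutils import escape

# Made-up sample data; the real census file has different columns.
SAMPLE_CSV = """\
SUMLEV,REGION,DIVISION,STATE,NAME,POP
10,0,0,0,United States,300
20,1,0,0,Northeast,55
40,1,1,9,Connecticut,3
40,X,X,72,Puerto Rico Commonwealth,4
"""


def record_kind(row):
    """Assumed rule: non-zero STATE means state, else non-zero REGION means region."""
    region, state = row[1], row[3]
    if state != "0":
        return "state"
    if region != "0":
        return "region"
    return "country"


def convert(csv_text):
    """Return one XML string per data row, skipping rows flagged with X."""
    header = None
    records = []
    for row in csv.reader(io.StringIO(csv_text)):
        if header is None:
            header = row  # first line supplies the element names
            continue
        if row[1] == "X":  # assumed position of the incomplete-data marker
            continue
        fields = "".join(
            "<{0}>{1}</{0}>".format(name, escape(value))
            for name, value in zip(header, row)
        )
        kind = record_kind(row)
        records.append("<{0}>{1}</{0}>".format(kind, fields))
    return records


for record in convert(SAMPLE_CSV):
    print(record)
```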

Finally, you check if the record contains an X in a specific field, and if so, set a Boolean indicator to True that will stop that particular row from being added to the XML document. The first way you can run this script is the same as before, from the command prompt. As you can see, the script has written the data directly to the screen.

It would be far more useful if this data were saved to a file. Rather than creating more Python code to do this, you can simply change the command you issue as follows to tell the command prompt to save the output to a file named data. This will create a new file in the project directory named data.

If you open this file in an application that reads and formats XML, such as Firefox, you might see a result like the one in Figure 7. Now, it is possible to use XQuery to split up this data and store it into separate rows. In the next section, you will modify the convert.

Previously, you learned how to format the CSV data you downloaded from the U.S. Census Bureau into a large XML document. Now you will learn how to take the rows for country, regions, and states and insert them into a DB2 database.

Make the changes listed in this section to the convert. To do this, change the first line of the convert. With a script like this, running it multiple times causes every row to be inserted repeatedly, resulting in a mass of duplicate data. To prevent that, clear the database tables at the start of the script so that each run starts fresh. You might remember the code to connect to the DB2 database from an earlier section of this tutorial, where you tested that the Python connection to DB2 was working.

This time, you are performing three SQL statements—deleting all data from the country, region, and state tables, respectively. In each case, Python will output a message either confirming that the statement executed successfully or that an error occurred. If an error does occur, the DB2 error message is relayed to the user, making it easier to debug what went wrong.
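The clear-then-report pattern described above can be illustrated as follows. To keep the sketch runnable anywhere, it uses Python's built-in sqlite3 module as a stand-in for DB2; the tutorial's own listing uses a DB2 connection instead, and the function name clear_tables and the message wording are hypothetical.

```python
import sqlite3

# Fixed table names from the tutorial; never build SQL from user input this way.
TABLES = ("country", "region", "state")


def clear_tables(conn, tables=TABLES):
    """Delete all rows from each table, reporting success or the error."""
    results = {}
    for table in tables:
        try:
            conn.execute("DELETE FROM " + table)
            conn.commit()
            results[table] = "ok"
            print("Cleared table:", table)
        except sqlite3.Error as exc:
            # Relay the database's own message to make debugging easier.
            results[table] = "error: {}".format(exc)
            print("Could not clear table", table, "-", exc)
    return results
```

Running this at the top of the script means each run starts with empty tables, so repeated runs do not accumulate duplicate rows.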

Next, you need to delete a couple of print statements that output the outer XML declaration for the single large document you created in the previous section.

Finally, you need to change the convert. You have already created code to determine if a particular line is a country, region, or state, and to generate the XML for the row; so all you need to do now is create the relevant INSERT statement and execute it. Find the line that currently reads print xml. You need to replace this line with the code from Listing 6.
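The replacement for the print statement might take a shape like the one below. This is not Listing 6: it is a sketch that again uses sqlite3 as a stand-in for DB2, and the column name data and the function name insert_record are assumptions. The success message follows the "Row added to state table" wording mentioned later in this tutorial.

```python
import sqlite3

ALLOWED_TABLES = ("country", "region", "state")


def insert_record(conn, kind, xml):
    """Insert one XML string into the table named by `kind`."""
    if kind not in ALLOWED_TABLES:
        raise ValueError("unknown record type: {!r}".format(kind))
    try:
        # The table name comes from a fixed whitelist; the XML is bound
        # as a parameter so it is never spliced into the SQL text.
        conn.execute("INSERT INTO {} (data) VALUES (?)".format(kind), (xml,))
        conn.commit()
        print("Row added to {} table".format(kind))
        return True
    except sqlite3.Error as exc:
        print("Error adding row to {} table: {}".format(kind, exc))
        return False
```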

Keep in mind that Python is very sensitive to code indentation, so be sure to line up your code correctly in your text editor. The final code for convert. Again, indentation is hugely significant in Python, so ensure that it is correct or you might experience unexpected results. Make sure you have saved this file and open a Windows command prompt. Change to the project directory and run the convert. You should see a number of "Row added to state table" messages appearing one after the other, as in Figure 8.

Make sure you are connected to the census database (issue the command connect to census if required) and run a query against the state table. This query should produce 51 results, as in Figure 9. Feel free to execute similar SQL statements to retrieve records from the country and region tables; you should get a single row result for the country table and four rows for the region table.

Next, you will learn how to read this data from DB2 into Python and present it to the user. In this section, you will learn how to build a command-line Python application that will request user input to select one of three menu options. These options will allow the user to view a list of states, regions, or countries ordered by the population derived from the census.
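The menu part of such an application could be sketched as below, in Python 3 syntax. The option labels, the numbering, and the function names are hypothetical, and the database connection is left out so that the menu logic stands on its own.

```python
# (key, table, label) for each menu option; labels are illustrative.
MENU = (
    ("1", "state", "List states by population"),
    ("2", "region", "List regions by population"),
    ("3", "country", "List countries by population"),
)


def menu_text():
    """Return the menu as a printable block of text."""
    return "\n".join("{}. {}".format(key, label) for key, _, label in MENU)


def table_for_choice(choice):
    """Map the user's input to a table name, or None if it is not valid."""
    for key, table, _ in MENU:
        if choice.strip() == key:
            return table
    return None


def ask(read=input, write=print):
    """Show the menu and return the table the user selected (or None)."""
    write(menu_text())
    return table_for_choice(read("Select an option: "))
```

Passing read and write as parameters keeps the prompt logic easy to exercise without a keyboard; calling ask() with no arguments uses the console.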

To start, you'll connect to the DB2 database, print the list of menu options, and request the user's input. Create a new file called read.

The word bit is a shortening of the term binary digit. One bit of memory holds a single binary digit. The byte is a grouping of eight bits. The eight-bit byte can conveniently store a single character. There are 256 possible combinations of 8 bits.

A kilobyte (KB) is equal to 1,024 bytes. A megabyte (MB) is equal to 1,024 kilobytes, or 1,048,576 bytes. A gigabyte (GB) is equal to about 1 billion bytes. A terabyte (TB) is equal to about 1 trillion bytes.

A petabyte is equal to about 1 quadrillion bytes. The greatest order of magnitude in common use is the exabyte, which approximately equals 1 quintillion bytes.

Nearly all computers use binary. The system fits neatly into the two states of a single bit of memory. The computer can only read directions given in binary form, which makes binary the computer's natural language.
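The sizes above can be checked with a few lines of Python. The sketch assumes the power-of-two convention (1 KB = 1,024 bytes), which is why the larger units come out only approximately equal to a billion, a trillion, and so on; the constant names are chosen for illustration.

```python
BITS_PER_BYTE = 8

# Each bit has two states, so n bits give 2**n combinations.
combinations = 2 ** BITS_PER_BYTE

# Binary (power-of-two) unit sizes, in bytes.
KB = 2 ** 10
MB = 2 ** 20
GB = 2 ** 30
TB = 2 ** 40
PB = 2 ** 50
EB = 2 ** 60

print(combinations)        # 256
print(KB)                  # 1024
print(MB)                  # 1048576
print("{:,}".format(GB))   # 1,073,741,824
```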

Binary code is the syntax that a computer uses to handle data. For example, a computer understands "HI" as a sequence of 0s and 1s. Character codes are used to represent all characters that can appear in data, such as numbers, letters, and special characters and symbols like the dollar sign, comma, percent symbol, and many mathematical characters. Unicode uses 1 to 4 bytes per character and is capable of representing over a million characters. Unicode is used by most web browsers and by the applications that run within them.
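To make the character-code idea concrete, the sketch below shows the bits behind "HI" and the number of bytes different characters need. It assumes ASCII for the first example and the UTF-8 encoding of Unicode for the second, since the text does not name a specific encoding; to_bits is a hypothetical helper.

```python
def to_bits(text, encoding="ascii"):
    """Return the encoded bytes of `text` as space-separated 8-bit groups."""
    return " ".join(format(byte, "08b") for byte in text.encode(encoding))


print(to_bits("HI"))  # 01001000 01001001

# UTF-8 uses between 1 and 4 bytes per character.
# Code points: U+0041 (A), U+00E9, U+20AC, U+1F600.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print("U+{:04X}".format(ord(ch)), len(ch.encode("utf-8")))
```

The loop prints byte counts of 1, 2, 3, and 4 for the four sample characters.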

The greatest advantage of using Unicode is that it can be used across the world and maintain consistent results. (Source: Understanding Computers Today and Tomorrow. Processing and Memory, 2.)

In this way, software programs must also be represented by 0s and 1s.

Machine language is also a binary code: every program instruction must be converted into it before the computer can execute that instruction. The computer uses a coding system to represent data.