My Python Programming

Saturday, January 5, 2013

Cloning lists including multidimensional lists...

Note this from the Python documentation.

If you just use the assignment function you don't really get a copy but a set of pointers to data that is changed as you change the original anyway.
This excellent summary explains why it is better to use the list() function rather than using [:] for example but this only seems to work for one dimensional arrays?

"...Shallow copies of dictionaries can be made using dict.copy(), and of lists by assigning a slice of the entire list, for example, copied_list =original_list[:]..."

The slice function also needs to be modified for two dimensional arrays as well. Suggestions including importing numpy or copy to use deepcopy() etc but this seems overkill...

See this post on how to clone 2d lists

Creating an A x B list to use as a matrix in Python

Creating a list that can be used as an A x B matrix in Python is not as straightforward as creating an empty list and then trying to append or insert data to it in the place at the co-ordinates you want. You would end up getting index errors etc.

Looking at the post on matrix formation here it looks like one method is to start by appending the number of items you need for, say, each column and then taking each column and appending the number of items you need to make up the rows.

This code shows how:

# create a 2D matrix of zeros and populate it
def make_list(size):
"""create a list of size number of zeros"""
mylist = []
for i in range(size):
mylist.append(0)
return mylist
def make_matrix(rows, cols):
"""
create a 2D matrix as a list of rows number of lists
where the lists are cols in size
resulting matrix contains zeros
"""
matrix = []
for i in range(rows):
matrix.append(make_list(cols))
return matrix
mx = make_matrix(3, 3)
print(mx)
print('-'*34)
# now populate the zero matrix
# for instance put a 5 in row 0, column 0
mx[0][0] = 5
# put a 7 in row 1, column 1
mx[1][1] = 7
# put a 9 in row 2, column 2
mx[2][2] = 9
print(mx)

The * function helps in filling up a list with items, but if you nest it you end up then creating copies of the SAME list again. This output shows how a nested * ends up working - i.e. you try and change one part of the matrix but instead it is changed in multiple columns (strangely without an index out of range error??)

mx = [[None]*3]*3
print(mx)
mx[1][1] = 7
print(mx)
"""
my output -->
looks okay
[[None, None, None], [None, None, None], [None, None, None]]
oops!!!!
[[None, 7, None], [None, 7, None], [None, 7, None]]
"""

However the best way seems to be to use the list function itself as shown here:

list(list(0 for i in range(3)) for j in range(3))

Remember you end up with a list of lists in Python that is referable like a basic array that you are used to.

Wednesday, November 7, 2012

Write your prime numbers to primes_file.txt

This program was to experiment with writing to file. It uses the last file set of PrimesTools.

First it works out the primes below 20, then uses that list as a parameter in the program to add on primes to an existing list and then finally writes to new file to the file.

What I learnt

using the write() method to write a string and a newline character to a file.
using the open() method to open a file - but more study of options needed
about file objects, returned when you open() a file I recall

What I need to do...

Check up on how the paths work and how to check about overwriting files etc
Probably my iteration could improve - maybe I dont need to use range() all the time
Check about how to examine properties of the file object created on opening the file.
Improve things by allowing proper input and functions call etc
Make this into a function itself to help extend my list of prime numbers.
Check if running in the IDE is slower than actually running off the python operating line...
Other options on the r

import PrimesTools

myPrimes = PrimesTools.listOfPrimes(20)
print("length of myprimes", len(myPrimes))

primesToWrite = PrimesTools.boundedListOfPrimes(myPrimes,100)
primesFile = open ("primes_file.txt","a")
for w in range(0,len(primesToWrite)):
primesFile.write(str(primesToWrite[w])+"\n")
primesFile.close()

My primes functions

A few primes functions

Here are a couple of programs for working with primes. The first is not so useful - just returning a list of all primes up to the parameter passed.

The next one you pass a list of knownPrimes and a new upper bound and primes are added onto the list returning an expanded knownPrimes at the end

The final one opens a file, checks that there are no composite numbers included by sorting the list and then checking that none of the later numbers are composites divisible by the earlier numbers. It also checks that the number of primes in the file is in the range that you would expect.

So, we are missing a function to actually write the new list of primes onto disk...

What I learnt
How to open a file on the disk and also a common way to read in line by line. For each line in the file open up the line and then use the strip() method to take out end of line keys and similar out of the string before converting into an integer.

listOfPrimes(n) - primes from 2 to n

def listOfPrimes(n):
"""supply a number, Returns a list of all the prime numbers from 2 up to parameter n"""

myPrimes = [2]
print(myPrimes[0])
isAPrime = None #define is a prime - probably not needed

for i in range(3,n+1): #up to n+1 in order to include n in range
for j in range(len(myPrimes)):
if i % myPrimes[j] == 0:
isAPrime = False
break
else:
isAPrime = True
if(isAPrime):
myPrimes.append(i)
print(myPrimes[j+1])

return myPrimes

boundedListOfPrimes(knownPrimes, n) - adds primes up to n to an existing list of primes

def boundedListOfPrimes(knownPrimes,n):
"""supply a list of known primes and an upper bound

works out the primes between the highest of knownPrimes and the upper bound.

appends the newly found primes to the list

in order to work the knownPrimes list must be complete from 2 onwards. """

if len(knownPrimes) == 0:
knownPrimes[0] = 2 #special case of first time to ever work out list set first value as 2

isAPrime = None #initial declaration of IsAPrime - start assuming isAPrime is true - i.e. numbers are all prime
highestKnownPrime = knownPrimes[len(knownPrimes)-1] #pick last one in the list
print("highest known prime",highestKnownPrime)


for i in range(highestKnownPrime,n+1): #n+1start testing from one more than last no in known primes
for j in range(len(knownPrimes)):
if i % knownPrimes[j] == 0:
isAPrime = False
break
else:
isAPrime = True
if(isAPrime):
knownPrimes.append(i)
#print(j+1,knownPrimes[j+1])

return knownPrimes


checkPrimesFile(primesFileName) - opens a file of primes and carries out some checks, returns the file

def checkPrimesFile(primesFileName):
"""purpose is to return an int list of numbers in a file after doing some checks that there
are no composite numbers in the file


asks for hte name of a primes file then reads in a text file of prime numbers and puts into an array
checks that all numbers in the file are primes by sorting it to the low to high and
then starting from the lowest testing the lowest against all the highest to make sure
nothing missing - then finally checks the theoretical distribution of primes against
the number.

Function has to be passed a null string or a name of a file. If it is a null string then
the function will ask for the name

"""

from math import log

if primesFileName == "":
primesFileName = input("Type the name of the Primes file")
print("Workign on ",primesFileName)

primesFileData = open(primesFileName)
myPrimes=[]

for line in primesFileData:
myPrimes.append(int(str(line).strip())) #make sure an integer and the last number is stripped

primesFileData.close()

myPrimes.sort() #make sure that the lowest number is first
includesAComposite = False #assume that file is valid to start with

for i in range(0,len(myPrimes)): #for every prime in the list
for j in range (i + 1,len(myPrimes)): #after sortign the list go through numbers bigger than i
if myPrimes[j] % myPrimes[i] == 0:
print ("found a composite number divisible by primes in the list - indedx is",j," number is", myPrimes[j])
includesAComposite = True #as we found one
break

if includesAComposite:
print("includes a composite or a duplicate prime so prime frequency data not checked")
else:
highestPrime = max(myPrimes)
lowerBound = highestPrime / log(highestPrime)
upperBound = lowerBound * 1.25506
if (len(myPrimes) > lowerBound and len(myPrimes) < upperBound):
print ("the number of primes " + str(len(myPrimes)) + " is within the bounds of ", lowerBound, upperBound)
else:
print ("out of bounds", len(myPrimes), lowerBound, upperBound)



return myPrimes

Euler Problem 10 - sum of primes below 2m

In the unlikely event that anyone is reading this blog, I am in my late forties with no programming experience since I bought a ZX81, so these are not the best programs in the world but I am learning!

Program approach

myPrimes is the list of primes starting at 2 as the first.
myMaximum is the limit of the prime numbers I want to find
testedNumber is 3 - i.e. the first number after 2 as the first seed prime
while loop carries on while the last prime found in myPrimes is less than myMaximum

Start the while loop assuming that testedNumber is a prime (i.e. isAPrime = True) then iterate through all the primes found to date in myPrimes to see if one of them goes into testedNumber. If one does then you can break the while loop.

After you have iterated through the primes in myPrimes If isAPrime remains true at the end then the testedNumber gets added to the list of primes and the number of primes found goes up by one.

What I learned

How to use while loops. In particular you can use another number as a counter in the while loop to keep track of stuff while the while loop is still working - in my case this incrementing primesFound by 1 and then using it to index myPrimes. In other words don't confuse for loops with while loops - while loops can monitor whether a condition is met or not without having to have an index.

Using the break function in for loops to stop wasting time/processing power.

Watch out for going through the while loop one too many times. In my case the while loop was looking at the last primeFound and stopped when it was over 2m. But this means that the last prime over 2m was included in the list.

'''
Created on 2012/09/24

@author: malcolm
'''
myPrimes=[2]
primesFound = 1
print(primesFound, myPrimes[primesFound-1])
myMaximum = 2000000
# primesWanted = 10001 carry over from problem 7
testedNumber = 3
while myPrimes[primesFound-1] < myMaximum :
isAPrime = True #testedNumber assumed to be a prime until flagged as False
for prime in myPrimes :
if testedNumber % prime == 0:
isAPrime = False
break #no point in carrying on with the for loop as testedNumber proven not to be a prime
if isAPrime: #isAPrime remains True after the for loop above is completed

myPrimes.append(testedNumber)
primesFound += 1 #only treat a prime as found if you add it to myPrimes! this keeps the while loop OK
if primesFound % 2000 == 0: print(primesFound,testedNumber) #help keep track of progress
testedNumber += 1 #the while loop will come to this last statement, test the expression and then carry on if needed

print(myPrimes)

cumulativePrimeTotal = 0
for prime in myPrimes:
cumulativePrimeTotal += prime

cumulativePrimeTotal -= myPrimes[len(myPrimes)-1] #take off very last prime as you went through the loop once to many times

print("the total is", cumulativePrimeTotal)

Saturday, November 3, 2012

Project Euler Question 18 - finding the longest route through a triangle

Euler Question 18 involved finding the longest route down a triangle.

This was done by giving numbers to each of the nodes in the triangle, working out the routes in number terms and then working out how to generate the same pattern and then working out how to list up all the routes in a list of lists.

The data - values for each node in the triangle - was put in one long string that was parsed into a dictionary.

The list of routes was then processed to find out the longest route.

Problems were:

(1) formulating exactly how to find the routes - this needed some diagramming out to work out that simply numbering the nodes in the triangle and then tracing through the numbers produced a regular pattern that could easily be adapted.

(2) Not realising that a recursive call to a function was the easiest way to actually process the routes. Initially I tried mixing in the actual route generation with loops.

(3) Not really understanding that a list variable pointed to the list and hence trying to copy a route in the form of a list did not really work. Instead I had to use extend to add to a nil list to copy.

(4) Also I then had to use append to extend the list with a new node number.

Here is some extend and append differences:

EXAMPLE 1:

append appends a single element. extend appends a list of elements.

Note that if you pass a list to append, it still adds one element:

>>> a = [1, 2, 3]
>>> a.append([4, 5, 6])
>>> a
[1, 2, 3, [4, 5, 6]]

EXAMPLE 2:

append adds an element to a list, extend concatenates the first list with another list. (Or another iterable, it doesn't have to be a list.)

>>> li = ['a', 'b', 'mpilgrim', 'z', 'example']
>>> li
['a', 'b', 'mpilgrim', 'z', 'example']
>>> li.append("new")               
>>> li
['a', 'b', 'mpilgrim', 'z', 'example', 'new']
>>> li.insert(2, "new")            
>>> li
['a', 'b', 'new', 'mpilgrim', 'z', 'example', 'new']
>>> li.extend(["two", "elements"]) 
>>> li
['a', 'b', 'new', 'mpilgrim', 'z', 'example', 'new', 'two', 'elements']

From Dive into Python .

Both from Stackoverflow extend vs append

'''
Created on 2012/11/03

@author: malcolm
'''

def generateRoutes(routesList,layer):

#passes a list of routes and the layer in which it is (you need this to work out the offset). Note you dont have to specify the type of argument passed

newRoutes = [] #construct a list of new routes - i.e. double the number of the routes past

for route in routesList:
madeRouteOne = []
madeRouteTwo = []
lastNode = route[len(route)-1] #Remember to use len(list)-1 for the index.
increment = lastNode + layer
madeRouteOne.extend(route) #you cannot go madeRoute1 = route as this simply points madeRoute1 to route in memory
madeRouteTwo.extend(route) #you have to use extend rather than append here - this is to get round the problems with copying
madeRouteOne.append(increment) #you have to use append rather than extend here
madeRouteTwo.append(increment + 1)
newRoutes.append(madeRouteOne) #and use append here
newRoutes.append(madeRouteTwo)

return newRoutes


myRoutes = [[1]] #start at the top of the triangle

for i in range(14): #14 is the number of layers in the Euler problem.
myRoutes = generateRoutes(myRoutes, i+1) #recursive - the return from the function call is used in the next function call

print(myRoutes) #to check it worked OK

numberString = "759564174782183587102004824765190123750334880277730763679965042806167092414126568340807033414872334732371694295371446525439152975114701133287773177839681757917152381714914358502729486366046889536730731669874031046298272309709873933853600423"
numberDic = {}

#numberDic holds the integer value taken from numberString for each point in the triangle, starting from number 1 at the peak
for i in range (0,240,2): #len(numberString) would be better than 240 here
numberDic[i/2+1] = int(numberString[i:i+2]) #remember range will only go up to 238


highestTotal = 0
highestRoute = myRoutes[0]
routeLength = len(myRoutes[0]) #to give a more general result

for route in myRoutes:
total = 0
for node in route:
total += numberDic[node]
if total > highestTotal:
highestTotal = total
highestRoute = route

print(highestTotal)
for i in range(len(highestRoute)):
print(numberDic[highestRoute[i]]) #highestRoute[i] goes through all the nodes in the highest route and then looks them up in the dictionary