UNB/ CS/ David Bremner/ tags/ python

This feed contains pages with tag "python".

Before the lab


Broadcasting

Time
25 minutes
Activity
Demo/Discussion

In Octave we can multiply every element of a matrix by a scalar using the .* operator.

A=[1,2,3;
   4,5,6];
B=A.*2

In general .* supports any two arguments of the same size.

C=A .* [2,2,2; 2,2,2]

It turns out these are actually the same operation, since Octave converts the first into the second via broadcasting.

Quoting from the Octave docs, for element-wise binary operators and functions

The rule is that corresponding array dimensions must either be equal, or one of them must be 1.

In the case where one of the dimensions is 1, the smaller matrix is tiled to match the dimensions of the larger matrix.

Here's another example you can try.

x = [1 2 3;
     4 5 6;
     7 8 9];

y = [10 20 30];

x + y

Reshaping arrays

One potentially surprising aspect of Octave arrays is that the number of dimensions is independent from the number of elements. We can add as many dimensions as we like, as long as the only possible index in those dimensions is 1. This can be particularly useful when trying to broadcast with higher dimensional arrays.

     ones(3,3,3) .* reshape([1,2,3],[1,1,3])
     ones(3,3,3) .* reshape([1,2,3],[1,3,1])
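
The dimension rule quoted above can be sketched in a few lines of Python (used here only because it is the language of the rest of the course; this is not how Octave implements it). The function name broadcast_shape is made up for this illustration, and the sketch assumes Octave's convention that missing trailing dimensions count as 1.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the shape of a broadcast result, or raise ValueError.

    Assumption: as in Octave, missing trailing dimensions count as 1,
    so the shapes (1, 3) and (1, 3, 1) describe the same array.
    """
    ndims = max(len(shape_a), len(shape_b))
    # Pad the shorter shape with trailing singleton dimensions.
    a = tuple(shape_a) + (1,) * (ndims - len(shape_a))
    b = tuple(shape_b) + (1,) * (ndims - len(shape_b))
    result = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError(f"nonconformant dimensions: {a} vs {b}")
    return tuple(result)


print(broadcast_shape((3, 3), (1, 3)))        # shapes of x and y: (3, 3)
print(broadcast_shape((3, 3, 3), (1, 1, 3)))  # the reshape example: (3, 3, 3)
```

Two shapes such as (2, 3) and (3, 2) fail the rule in both dimensions, so the sketch raises ValueError for them.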

Scaling layers of arrays

Time
25 minutes
Activity
Individual

Complete the following function. You may want to copy the definitions of A and B into the REPL to understand the use of cat.

## usage: scale_layers(array, weights)
##
## multiply each layer of a 3D array by the corresponding weight
function out = scale_layers(array, weights)
  out =
endfunction

%!test
%! onez = ones(3,3);
%! A=cat(3,onez, 2*onez, 3*onez);
%! B=cat(3,onez, 6*onez, 15*onez);
%! assert(scale_layers(A,[1;3;5]),B)

Scaling a colour channel

Save the image above left as ~/cs2613/labs/L21/paris.jpg (make sure you get the full resolution image, and not the thumbnail).

Run the following demo code; you can change the weight vector for different colourization.

paris=imread("paris.jpg");
sepia=scale_layers(paris,[0.9,0.62,0.34]);
imshow(sepia);

You should get something like the following


The second half of the lab will be a quiz on Python.


Before next lab

Posted Tags: /tags/python
CS2613 Assignment 4

Overview

This assignment is based on the material covered in Lab 15 and Lab 16.

The goal of the assignment is to develop a simple query language that lets the user select rows and columns from a CSV file, in effect treating it like a database.

General Instructions

2013-100.csv.gz

2013-1000.csv.gz

2014-100.csv.gz

2015-1000.csv.gz

2014-1000.csv.gz

2015-100.csv.gz

Reading CSV Files

We will use the built-in Python csv module to read CSV files.

def read_csv(filename):
    '''Read a CSV file, return list of rows'''
    import csv
    with open(filename,'rt',newline='') as f:
        reader = csv.reader(f, skipinitialspace=True)
        return [ row for row in reader ]

Save the following as “~/fcshome/assignments/A4/test1.csv”; we will use it in several tests. You should also construct your own example CSV files and corresponding tests.

name, age, eye colour
Bob, 5, blue
Mary, 27, brown
Vij, 54, green

Here is a test to give you the idea of the returned data structure from read_csv.

def test_read_csv():
    assert read_csv('test1.csv') == [['name', 'age', 'eye colour'],
                                     ['Bob', '5', 'blue'],
                                     ['Mary', '27', 'brown'],
                                     ['Vij', '54', 'green']]
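
If you are wondering what skipinitialspace=True is doing in read_csv, the following sketch parses the same text twice from memory. The helper name parse and the constant SAMPLE are made up for this illustration; only csv.reader and io.StringIO come from the standard library.

```python
import csv
import io

# Same layout as test1.csv: a space follows each comma.
SAMPLE = "name, age, eye colour\nBob, 5, blue\n"


def parse(text, **options):
    """Parse CSV text held in memory, passing options on to csv.reader."""
    return list(csv.reader(io.StringIO(text), **options))


default_rows = parse(SAMPLE)
stripped_rows = parse(SAMPLE, skipinitialspace=True)

print(default_rows[0])   # ['name', ' age', ' eye colour']
print(stripped_rows[0])  # ['name', 'age', 'eye colour']
```

Without the option, the column labels would keep their leading spaces, and lookups by label such as 'age' would miss.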

Parsing Headers

The first row in most CSV files consists of column labels. We will use this to help the user access columns by name rather than by counting columns.

Write a function header_map that builds a dictionary from labels to column numbers.

table = read_csv('test1.csv')

def test_header_map_1():
    hmap = header_map(table[0])
    assert hmap == { 'name': 0, 'age': 1, 'eye colour': 2 }

Transforming rows into dictionaries

Sometimes it’s more convenient to work with rows of the table as dictionaries, rather than passing around the map of column labels everywhere. Write a function row2dict that takes the output from header_map, and a row, and returns a dictionary representing that row (column order is lost here, but that will be ok in our application).

def test_row2dict():
    hmap = header_map(table[0])
    assert row2dict(hmap, table[1]) == {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}

Matching rows

We are going to write a simple query language where each query is a 3-tuple (left, op, right), and op is one of =, <, and >. In the initial version, left and right are numbers or strings. Strings are interpreted as follows: if they are column labels, retrieve the value in that column; otherwise treat it as a literal string. With this in mind, write a function check_row that takes a row in dictionary form, and checks if it matches a query tuple.

def test_check_row():
    row = {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}
    assert check_row(row, ('age', '=', 5))
    assert not check_row(row, ('eye colour', '=', 5))
    assert check_row(row, ('eye colour', '=', 'blue'))
    assert check_row(row, ('age', '>', 4))
    assert check_row(row, ('age', '<', 1000))
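
One thing to watch for in these tests: read_csv returns every value as a string, while the queries contain numbers. The sketch below shows the pitfalls. The helper to_number is hypothetical (it is not part of the assignment, and you may well prefer a different approach); it is only one way to make the two sides comparable.

```python
def to_number(value):
    """Hypothetical helper: turn '5' into 5, leave everything else alone."""
    if isinstance(value, str):
        try:
            return int(value)
        except ValueError:
            return value  # not numeric, e.g. 'blue'
    return value


age = '5'                                # what read_csv produces
print(age == 5)                          # False: a str never equals an int
print('27' > '4')                        # False: strings compare character by character
print(to_number('27') > to_number('4'))  # True: compared as integers
```

Note also that an ordering comparison between a string and a number, such as '5' > 4, raises TypeError in Python 3 rather than returning False.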

Extending the query language

Extend check_row so that it supports operations AND and OR. For these cases both left and right operands must be queries. Hint: this should only be a few more lines of code.

def test_check_row_logical():
    row = {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}
    assert check_row(row, (('age', '=', 5),'OR',('eye colour', '=', 5)))
    assert not check_row(row, (('age', '=', 5),'AND',('eye colour', '=', 5)))

Filtering tables

Use your previously developed functions to implement a function filter_table that selects certain rows of the table according to a query.

def test_filter_table1():
    assert filter_table(table,('age', '>', 0)) == [['name', 'age', 'eye colour'],
                                                   ['Bob', '5', 'blue'],
                                                   ['Mary', '27', 'brown'],
                                                   ['Vij', '54', 'green']]

    assert filter_table(table,('age', '<', 28)) == [['name', 'age', 'eye colour'],
                                                    ['Bob', '5', 'blue'],
                                                    ['Mary', '27', 'brown']]

    assert filter_table(table,('eye colour', '=', 'brown')) == [['name', 'age', 'eye colour'],
                                                                ['Mary', '27', 'brown']]

    assert filter_table(table,('name', '=', 'Vij')) == [['name', 'age', 'eye colour'],
                                                        ['Vij', '54', 'green']]


def test_filter_table2():
    assert filter_table(table,(('age', '>', 0),'AND',('age','>','26'))) == [['name', 'age', 'eye colour'],
                                                                            ['Mary', '27', 'brown'],
                                                                            ['Vij', '54', 'green']]


    assert filter_table(table,(('age', '<', 28),'AND',('age','>','26'))) == [['name', 'age', 'eye colour'],
                                                                             ['Mary', '27', 'brown']]

    assert filter_table(table,(('eye colour', '=', 'brown'),
                               'OR',
                               ('name','=','Vij'))) == [['name', 'age', 'eye colour'],
                                                        ['Mary', '27', 'brown'],
                                                        ['Vij', '54', 'green']]
Lab 19

Before the lab

Getting started

Generators and iterator classes

Time
25 minutes
Activity
individual

Consider the countdown generator from Section 6.2

def countdown(n):
    print('Counting down from', n)
    while n > 0:
        yield n
        n -= 1
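
Before writing the class version, it may help to drive the generator by hand and watch the iterator protocol at work. This is a small sketch using only the countdown generator above and the built-ins iter and next; the variable names are made up.

```python
def countdown(n):
    print('Counting down from', n)
    while n > 0:
        yield n
        n -= 1


gen = countdown(2)        # nothing is printed yet: the body has not started
same = iter(gen) is gen   # a generator is its own iterator

first = next(gen)         # runs up to the first yield (and prints the message)
second = next(gen)        # resumes after the yield, loops, yields again

try:
    next(gen)             # the loop ends, so the generator is finished
    exhausted = False
except StopIteration:
    exhausted = True

print(same, first, second, exhausted)  # True 2 1 True
```

A for loop does exactly this behind the scenes: it calls iter once, then next repeatedly until StopIteration is raised.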

Referring to Section 6.1, write an equivalent iterator class. Here we will implement the entire protocol with one class, but using multiple classes, as we did for JavaScript, is also reasonable. Modify only the __next__ method of the following skeleton class so that the given tests pass.

def countdown(n):
    yield 'starting'
    while n > 0:
        yield n
        n -= 1

def test_generator():
    counter = countdown(3)
    assert [ t for t in counter ] == \
        ['starting', 3,  2,  1]

class Counter:
    "Iterator class simulating countdown"
    def __init__(self,n):
        self.n = n
        self.first = True

    def __iter__(self):
        return self
    
    def __next__(self):
        # insert code here
        self.n -= 1
        return self.n+1

def test_class():
    counter = Counter(3)
    assert [ t for t in counter ] == \
        ['starting', 3,  2,  1]

What is __iter__ for?

Time
25 minutes
Activity
individual

Update the __iter__ method of your previous solution so that the following additional test passes

def test_twice():
    counter = Counter(3)
    assert [ t for t in counter ] == \
        ['starting', 3,  2,  1]
    assert [ t for t in counter ] == \
        ['starting', 3,  2,  1]
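
As background for this exercise, here is a sketch of how two built-in types answer the question in the heading. A list hands out a fresh iterator every time iter is called on it, while a generator returns itself and so can only be traversed once. The variable names are made up for the illustration.

```python
numbers = [3, 2, 1]

# A list is iterable but is not itself an iterator.
list_is_own_iterator = iter(numbers) is numbers
first_list_pass = [t for t in numbers]
second_list_pass = [t for t in numbers]

# A generator is its own iterator, so the second pass finds nothing left.
gen = (n for n in numbers)
gen_is_own_iterator = iter(gen) is gen
first_gen_pass = [t for t in gen]
second_gen_pass = [t for t in gen]

print(list_is_own_iterator, first_list_pass, second_list_pass)
# False [3, 2, 1] [3, 2, 1]
print(gen_is_own_iterator, first_gen_pass, second_gen_pass)
# True [3, 2, 1] []
```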

Using Generators I

Time
25 minutes
Activity
individual

Do Exercises 6.5 and 6.6.

Note that 6.5 just asks you to run some code, but you do have to be in the correct directory when running it.

For 6.6, you essentially need to split the given code so that the first part is the new generator follow.

Using Generators II

Time
25 minutes
Activity
individual

Start with the following stock related classes:

tableformat.py

fileparse.py

portfolio.py

report.py

stock.py

Complete Exercise 6.7. I suggest you make a new file follow2.py so that you preserve solutions for 6.6 and 6.7.

Lab 18

Before the Lab

Getting started


Methods

Time
25 minutes
Activity
Write accessor and mutator methods.

def test_cost2():
    s = Stock('GOOG', 100, 490.10)
    assert s.cost() == pytest.approx(49010.0,0.001)

def test_sell():
    s = Stock('GOOG', 100, 490.10)
    s.sell(25)
    assert s.shares == 75
    assert s.cost() == pytest.approx(36757.5, 0.001)

Special Methods

Time
25 minutes
Activity
Code to test, learn about "dunder" methods.
def test_repr():
    goog = Stock('GOOG', 100, 490.1)
    assert repr(goog) == "Stock('GOOG', 100, 490.1)"
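
To see the mechanism without giving away the Stock version, here is a sketch of __repr__ on an unrelated class. Point is a made-up example class, not something from the course material.

```python
class Point:
    """Hypothetical example class, used only to illustrate __repr__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # !r applies repr() to each attribute, so strings would keep their quotes.
        return f'Point({self.x!r}, {self.y!r})'


p = Point(1, 2.5)
print(repr(p))   # Point(1, 2.5)
print([p])       # [Point(1, 2.5)]  (containers use repr for their elements)
```

Because the output looks like a constructor call, eval(repr(p)) builds an equivalent Point, which is the usual convention for __repr__ when it is practical.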

Static methods

Time
25 minutes
Activity
Transform code, use template for static methods
    @staticmethod
    def read_portfolio(filename):
        # code from 4.3 goes here
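
For comparison, here is a sketch of the same pattern on an unrelated class. Point and read_points are made-up names; the point of the example is only that a static method takes no self argument and is called through the class.

```python
class Point:
    """Hypothetical example class, used only to illustrate @staticmethod."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    @staticmethod
    def read_points(lines):
        """Build a list of Point objects from strings like '1,2'."""
        points = []
        for line in lines:
            x, y = line.split(',')
            points.append(Point(int(x), int(y)))
        return points


# Called on the class itself; no Point instance is needed first.
points = Point.read_points(['1,2', '3,4'])
print(len(points), points[1].x, points[1].y)  # 2 3 4
```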

Your completed static method should pass

def test_read_portfolio():
    portfolio = Stock.read_portfolio('Data/portfolio.csv')
    assert repr(portfolio[0:3]) == \
        "[Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44)]"

Inheritance

Time
25 minutes
Activity
Refactor given code, work with class hierarchy

Start with the following class hierarchy based on Exercises 4.5 and 4.6

class TableFormatter:
    def headings(self, headers):
        '''
        Emit the table headings.
        '''
        raise NotImplementedError()

    def row(self, rowdata):
        '''
        Emit a single row of table data.
        '''
        raise NotImplementedError()

class TextTableFormatter(TableFormatter):
    '''
    Emit a table in plain-text format
    '''
    def headings(self, headers):
        
        output = ''
        for h in headers:
            output += f'{h:>10s} '
        output+='\n'
        output+=(('-'*10 + ' ')*len(headers))
        output += '\n'
        return output
    
    def row(self, rowdata):
        output = ''
        for d in rowdata:
            output+=f'{d:>10s} '
        output += '\n'
        return output

def test_text_2():
    portfolio=stock.Stock.read_portfolio('Data/portfolio.csv')
    formatter= TextTableFormatter()
    output= formatter.headings(['Name','Shares','Price', 'Cost'])
    for obj in portfolio[0:3]:
        output +=formatter.row([obj.name,f'{obj.shares}',
                                f'{obj.price:0.2f}',f'{obj.cost():0.2f}'])

    assert '\n' + output == '''
      Name     Shares      Price       Cost 
---------- ---------- ---------- ---------- 
        AA        100      32.20    3220.00 
       IBM         50      91.10    4555.00 
       CAT        150      83.44   12516.00 
'''
 
def test_string_1():
    portfolio=stock.Stock.read_portfolio('Data/portfolio.csv')
    formatter= StockTableFormatter()
    output= formatter.headings(['Name','Shares','Price', 'Cost'])
    for obj in portfolio[0:3]:
        output +=formatter.row(obj)

    assert '\n' + output == '''
      Name     Shares      Price       Cost 
---------- ---------- ---------- ---------- 
        AA        100      32.20    3220.00 
       IBM         50      91.10    4555.00 
       CAT        150      83.44   12516.00 
'''
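
The formatters above rely on the format specification >10s, which right-aligns a string in a field of width 10. The sketch below (variable names made up) shows why test_text_2 converts numbers to strings before calling row: the s format code only accepts strings.

```python
name_cell = f"{'AA':>10s}"           # right-align in a 10-character field
price_text = f"{32.2:0.2f}"          # format the float first: '32.20'
price_cell = f"{price_text:>10s}"    # then pad it like any other string

try:
    format(100, '>10s')              # an int does not support the 's' code
    int_accepted = True
except ValueError:
    int_accepted = False

print(repr(name_cell))    # '        AA'
print(repr(price_cell))   # '     32.20'
print(int_accepted)       # False
```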

Before next lab

Read

Lab 17

Before the Lab


Getting Started

Higher order functions

Time
20 minutes
Activity
Assemble given pieces
def test_portfolio():
    portfolio = parse_csv('Data/portfolio.csv', types=[str, int, float])
    assert portfolio == [{'price': 32.2, 'name': 'AA', 'shares': 100},
                         {'price': 91.1, 'name': 'IBM', 'shares': 50},
                         {'price': 83.44, 'name': 'CAT', 'shares': 150},
                         {'price': 51.23, 'name': 'MSFT', 'shares': 200},
                         {'price': 40.37, 'name': 'GE', 'shares': 95},
                         {'price': 65.1, 'name': 'MSFT', 'shares': 50},
                         {'price': 70.44, 'name': 'IBM', 'shares': 100}]

def test_shares():
    shares_held = parse_csv('Data/portfolio.csv', select=['name', 'shares'], types=[str, int])
    assert shares_held == [{'name': 'AA', 'shares': 100}, {'name': 'IBM', 'shares': 50},
                           {'name': 'CAT', 'shares': 150}, {'name': 'MSFT', 'shares': 200},
                           {'name': 'GE', 'shares': 95}, {'name': 'MSFT', 'shares': 50},
                           {'name': 'IBM', 'shares': 100}]
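
The types argument in these tests is a list of functions, one per column. As a sketch of the idea (convert_row is a made-up name, and your parse_csv may be organized differently), zip pairs each conversion function with the string it should be applied to:

```python
def convert_row(row, types):
    """Apply the i-th conversion function to the i-th string in the row."""
    return [func(val) for func, val in zip(types, row)]


row = ['AA', '100', '32.20']                 # strings, as read from a CSV file
converted = convert_row(row, [str, int, float])
record = dict(zip(['name', 'shares', 'price'], converted))

print(converted)  # ['AA', 100, 32.2]
print(record)     # {'name': 'AA', 'shares': 100, 'price': 32.2}
```

Passing str, int and float around as ordinary values is what makes this a higher order function.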

Refactoring a function

Time
30 minutes
Activity
Refactor function to add feature
def test_tuple():
    prices = parse_csv('Data/prices.csv', types=[str,float], has_headers=False)
    assert prices == [('AA', 9.22), ('AXP', 24.85), ('BA', 44.85), ('BAC', 11.27),
                      ('C', 3.72), ('CAT', 35.46), ('CVX', 66.67), ('DD', 28.47),
                      ('DIS', 24.22), ('GE', 13.48), ('GM', 0.75), ('HD', 23.16),
                      ('HPQ', 34.35), ('IBM', 106.28), ('INTC', 15.72), ('JNJ', 55.16),
                      ('JPM', 36.9), ('KFT', 26.11), ('KO', 49.16), ('MCD', 58.99),
                      ('MMM', 57.1), ('MRK', 27.58), ('MSFT', 20.89), ('PFE', 15.19),
                      ('PG', 51.94), ('T', 24.79), ('UTX', 52.61), ('VZ', 29.26),
                      ('WMT', 49.74), ('XOM', 69.35)]

JavaScript Quiz

The second half of the lab will be a quiz on JavaScript

Before next lab

Read

Lab 16

Before the Lab


Getting Started

Files

Time
35 minutes
Activity
Individual programming, synthesis

Data types

Time
25 minutes
Activity
Convert REPL session to script with tests
def test_cost():
    d = parse_row(row)
    cost = compute_cost (d)
    assert cost == pytest.approx(3220.000,abs=0.00000001)

def test_d():
    d = parse_row(row)
    update_dict(d)
    assert d == {'name': 'AA', 'shares': 75, 'price':32.2, 'date': (6, 11, 2007),
                 'account': 12345}

Containers: read_portfolio

Time
25 minutes
Activity
Write function based on template
def test_read():
    portfolio = read_portfolio('Data/portfolio.csv')
    assert portfolio == \
        [('AA', 100, 32.2), ('IBM', 50, 91.1),
         ('CAT', 150, 83.44), ('MSFT', 200, 51.23),
         ('GE', 95, 40.37), ('MSFT', 50, 65.1), ('IBM', 100, 70.44)]

def test_total():
    portfolio = read_portfolio('Data/portfolio.csv')
    total = 0.0
    for name, shares, price in portfolio:
        total += shares*price
    assert total == pytest.approx(44671.15,abs=0.001)
def test_read():
    portfolio = read_portfolio('Data/portfolio.csv')
    assert portfolio == \
        [{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1},
         {'name': 'CAT', 'shares': 150, 'price': 83.44}, {'name': 'MSFT', 'shares': 200, 'price': 51.23},
         {'name': 'GE', 'shares': 95, 'price': 40.37}, {'name': 'MSFT', 'shares': 50, 'price': 65.1},
         {'name': 'IBM', 'shares': 100, 'price': 70.44}]

def test_total():
    portfolio = read_portfolio('Data/portfolio.csv')

    total = 0.0
    for s in portfolio:
        total += s['shares']*s['price']
    assert total == pytest.approx(44671.15,abs=0.001)

Sequences

Time
25 minutes
Activity
Convert REPL session to tests
def test_items():
    assert prices.items() == \
        [('GOOG', 490.1), ('AA', 23.45), ('IBM', 91.1), ('MSFT', 34.23)]
def test_zip():
    assert pricelist == \
        [(490.1, 'GOOG'), (23.45, 'AA'), (91.1, 'IBM'), (34.23, 'MSFT')]

def test_min_max():
    assert min(pricelist) == (23.45, 'AA')
    assert max(pricelist) == (490.1, 'GOOG')
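
When converting the REPL session, keep in mind that in Python 3 prices.items() returns a view object rather than a list, so comparing it directly with a list is always False. The sketch below assumes a prices dictionary and a pricelist built with zip, matching the values in the tests above; how you define them in your own file is up to you.

```python
prices = {'GOOG': 490.1, 'AA': 23.45, 'IBM': 91.1, 'MSFT': 34.23}
expected = [('GOOG', 490.1), ('AA', 23.45), ('IBM', 91.1), ('MSFT', 34.23)]

view_equals_list = prices.items() == expected        # a view is not a list
list_equals_list = list(prices.items()) == expected  # convert before comparing

# zip pairs values with keys; tuples compare by their first element first,
# so min and max pick the cheapest and the most expensive entry.
pricelist = list(zip(prices.values(), prices.keys()))

print(view_equals_list, list_equals_list)  # False True
print(min(pricelist), max(pricelist))      # (23.45, 'AA') (490.1, 'GOOG')
```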

List Comprehensions

Time
25 minutes
Activity
Write tests based on REPL session
import pytest
from report2 import read_portfolio
prices = {
        'GOOG' : 490.1,
        'AA' : 23.45,
        'CAT': 35.46,
        'IBM' : 91.1,
        'MSFT' : 34.23,
        'GE': 13.48,
    }

Note that the value test will need to be adjusted for a value of approximately 31167.10, since our price list is different from the one used in the book.
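
To check the figure quoted above, here is a sketch that computes the portfolio value with a list comprehension. The portfolio list is typed in by hand from the earlier read_portfolio test rather than read from Data/portfolio.csv, so the example runs on its own.

```python
prices = {
    'GOOG': 490.1,
    'AA': 23.45,
    'CAT': 35.46,
    'IBM': 91.1,
    'MSFT': 34.23,
    'GE': 13.48,
}

# Hand-copied stand-in for read_portfolio('Data/portfolio.csv').
portfolio = [
    {'name': 'AA', 'shares': 100, 'price': 32.2},
    {'name': 'IBM', 'shares': 50, 'price': 91.1},
    {'name': 'CAT', 'shares': 150, 'price': 83.44},
    {'name': 'MSFT', 'shares': 200, 'price': 51.23},
    {'name': 'GE', 'shares': 95, 'price': 40.37},
    {'name': 'MSFT', 'shares': 50, 'price': 65.1},
    {'name': 'IBM', 'shares': 100, 'price': 70.44},
]

# Current value: shares held times the current price of each holding.
value = sum([s['shares'] * prices[s['name']] for s in portfolio])
print(f'{value:0.2f}')  # 31167.10
```

Since the result is a float, a test should compare it with pytest.approx rather than with ==.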


Before next lab

Read

Lab 15

Before the Lab

Questions

Time
5 Minutes
Activity
Group discussion

Getting Started

Mortgage Calculator

Time
20 Minutes
Activity
Modify given program

Do exercises 1.7 and 1.8 from Python Numbers

Introduction to Python strings

Time
20 Minutes
Activity
Experiment in Python REPL

Do exercises 1.13, 1.14, 1.15 from Python Strings


Pytest

Time
10 minutes
Activity
Demo
symbols = 'HPQ,AAPL,IBM,MSFT,YHOO,DOA,GOOG'
symlist = symbols.split(',')

def test_lookup0():
    assert symlist[0] == 'HPQ'

def test_lookup1():
    assert symlist[1] == 'AAPL'
[student@id414m22 L14]$ pytest listex.py
=================== test session starts ===================
platform linux -- Python 3.9.18, pytest-7.4.3, pluggy-1.3.0
rootdir: /home1/ugrad/student/cs2613/labs/L14
plugins: pylama-8.4.1, cov-4.1.0
collected 2 items

listex.py ..                                        [100%]

==================== 2 passed in 0.02s ====================

Lists and Pytest

Time
20 minutes
Activity
Individual programming from template

Functions and coverage

Time
20 minutes
Activity
Individual programming, modify previous solution

We have already been using Python functions for pytest, without really thinking about how they work. In Part 1.7 of Practical Python, functions are explained.

def sumcount(n):
    '''
    Returns the sum of the first n integers
    '''
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total
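
One way to test sumcount is to compare it with the closed form n(n+1)/2. This is only a sketch of a possible test (the name test_sumcount_closed_form is made up), with sumcount repeated so the example runs on its own.

```python
def sumcount(n):
    '''
    Returns the sum of the first n integers
    '''
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total


def test_sumcount_closed_form():
    # The sum 1 + 2 + ... + n equals n * (n + 1) / 2.
    for n in range(0, 50):
        assert sumcount(n) == n * (n + 1) // 2


test_sumcount_closed_form()
print(sumcount(10))  # 55
```

The case n = 0 never enters the loop body, which is worth keeping in mind when you look at the coverage report.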

Before next lab

Read

Indexing Debian's buildinfo

Introduction

Debian is currently collecting buildinfo files, but they are not very conveniently searchable. Eventually Chris Lamb's buildinfo.debian.net may solve this problem, but in the meantime, I decided to see how practical indexing the full set of buildinfo files is with sqlite.

Hack

  1. First you need a copy of the buildinfo files. This is currently about 2.6G, and unfortunately you need to be a Debian developer to fetch it.

     $ rsync -avz mirror.ftp-master.debian.org:/srv/ftp-master.debian.org/buildinfo .
    
  2. Indexing takes about 15 minutes on my 5 year old machine (with an SSD). If you index all dependencies, you get a database of about 4G, probably because of my natural genius for database design. Restricting to debhelper and dh-elpa, it's about 17M.

     $ python3 index.py
    

    You need at least python3-debian installed.

  3. Now you can do queries like

     $ sqlite3 depends.sqlite "select * from depends where depend='dh-elpa' and depend_version<='0106'"
    

    where 0106 is some ad hoc normalization of 1.6
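
The kind of normalization involved can be sketched as follows. This is a guess at the idea, not the code from index.py: the function name normalize and the choice of zero-padding each dotted component to two digits are assumptions made for illustration.

```python
def normalize(version):
    """Hypothetical normalization: zero-pad each dotted component to two digits."""
    return ''.join(part.zfill(2) for part in version.split('.'))


print(normalize('1.6'))                      # 0106
print('1.10' > '1.6')                        # False: plain strings sort 1.10 first
print(normalize('1.10') > normalize('1.6'))  # True: '0110' > '0106'
```

The padding is what lets the string comparison in the sqlite query order versions sensibly, at least while every component fits in two digits.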

Conclusions

The version number hackery is pretty fragile, but good enough for my current purposes. A more serious limitation is that I don't currently have a nice (and you see how generous my definition of nice is) way of limiting to builds currently available e.g. in Debian unstable.

Trivial example using python to hack ical

I could not find any nice examples of using the vobject class to filter an icalendar file. Here is what I got to work. I'm sure there is a nicer way. This strips all of the valarm subevents (reminders) from an icalendar file.

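import sys

import vobject

# Parse a single calendar object from standard input.
cal = vobject.readOne(sys.stdin)

for ev in cal.vevent_list:
    # Drop the reminders (valarm components) attached to this event.
    if 'valarm' in ev.contents:
        del ev.contents['valarm']

print(cal.serialize())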