4.7. Sequence Unpack Slice
Slice argument must be int (positive, negative or zero)
Positive index starts with 0
Negative index starts with -1
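As a small sketch of the two numbering schemes (variable names here are illustrative only), every element can be reached either by a non-negative index counted from the left or by a negative index counted from the right:

```python
data = 'abcde'

# Non-negative indexes count from the left, starting at 0
first = data[0]       # 'a'
last = data[4]        # 'e'

# Negative indexes count from the right, starting at -1
last_neg = data[-1]   # 'e'
first_neg = data[-5]  # 'a'

# Both schemes address the same elements: index -n equals len(data) - n
same = all(data[-n] == data[len(data) - n] for n in range(1, len(data) + 1))
print(first, last, last_neg, first_neg, same)
```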
4.7.1. Slice Forwards
sequence[start:stop]
>>> data = 'abcde'
>>> data[0:3]
'abc'
>>> data = 'abcde'
>>> data[2:5]
'cde'
4.7.2. Slice Defaults
sequence[start:stop]
start defaults to 0
stop defaults to len(sequence)
>>> data = 'abcde'
>>> data[:3]
'abc'
>>> data = 'abcde'
>>> data[3:]
'de'
>>> data = 'abcde'
>>> data[:]
'abcde'
4.7.3. Slice Backwards
Negative indexes count from the end of the sequence, going right to left
>>> data = 'abcde'
>>> data[-3:-1]
'cd'
>>> data = 'abcde'
>>> data[-3:]
'cde'
>>> data = 'abcde'
>>> data[0:-3]
'ab'
>>> data = 'abcde'
>>> data[:-3]
'ab'
>>> data = 'abcde'
>>> data[-3:0]
''
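One way to read the examples above is to translate each negative bound into its equivalent position counted from the left. The sketch below assumes nothing beyond the built-in `len()` and shows why `data[-3:0]` comes out empty:

```python
data = 'abcde'
n = len(data)

# A negative bound -k refers to the same position as n - k
assert data[-3:-1] == data[n - 3:n - 1] == 'cd'
assert data[-3:] == data[n - 3:] == 'cde'
assert data[:-3] == data[:n - 3] == 'ab'

# data[-3:0] is empty: start resolves to position 2, stop to position 0,
# and the default step of 1 only moves to the right
empty = data[-3:0]
print(repr(empty))
```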
4.7.4. Step Forward
Every n-th element
sequence[start:stop:step]
start defaults to 0
stop defaults to len(sequence)
step defaults to 1
>>> data = 'abcde'
>>> data[::1]
'abcde'
>>> data = 'abcde'
>>> data[::2]
'ace'
>>> data = 'abcde'
>>> data[::3]
'ad'
>>> data = 'abcde'
>>> data[1:4:2]
'bd'
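4.7.5. Step Backward
Every n-th element, counting from the end
sequence[start:stop:step]
With a negative step, start defaults to the last element (index -1)
With a negative step, stop defaults to just before the first element
step defaults to 1, so it must be given explicitly to go backward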
>>> data = 'abcde'
>>> data[::-1]
'edcba'
>>> data = 'abcde'
>>> data[::-2]
'eca'
>>> data = 'abcde'
>>> data[::-3]
'eb'
>>> data = 'abcde'
>>> data[4:1:-2]
'ec'
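To see which positions a backward slice actually visits, the built-in `slice.indices()` method can resolve omitted values for a given length. This is only an illustrative sketch; the names `resolved` and `reversed_text` are made up for the example:

```python
data = 'abcde'

# slice.indices(length) returns the concrete (start, stop, step)
# that a slice resolves to for a sequence of the given length
resolved = slice(None, None, -1).indices(len(data))
print(resolved)  # (4, -1, -1)

# Rebuilding the result with range() shows which positions are visited
reversed_text = ''.join(data[i] for i in range(*resolved))
print(reversed_text)  # edcba

# Writing the resolved stop literally does not give the same result,
# because -1 as a slice bound means "the last element"
literal = data[4:-1:-1]
print(repr(literal))  # ''
```

The resolved stop of `-1` is meant for `range()`, not for slice syntax, which is why leaving `stop` out is the usual way to reach the first element when stepping backward.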
4.7.6. Slice Errors
>>> data = 'abcde'
>>> data[::0]
Traceback (most recent call last):
ValueError: slice step cannot be zero
>>> data = 'abcde'
>>> data[::1.0]
Traceback (most recent call last):
TypeError: slice indices must be integers or None or have an __index__ method
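The `TypeError` message mentions `__index__`. As a hedged illustration, the hypothetical `Position` class below (invented for this example, not part of any library) implements that method and is therefore accepted as a slice bound, while a float has to be converted explicitly:

```python
class Position:
    """Hypothetical index wrapper, invented for this illustration."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Called by Python whenever an integer index is required
        return self.value


data = 'abcde'

# Objects implementing __index__ are accepted as slice bounds
part = data[Position(1):Position(4)]
print(part)  # bcd

# A float has no __index__, so convert it explicitly before slicing
step = int(2.0)
every_second = data[::step]
print(every_second)  # ace
```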
4.7.7. Out of Range
>>> data = 'abcde'
>>> data[:100]
'abcde'
>>> data = 'abcde'
>>> data[100:]
''
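The examples above show that slice bounds beyond the sequence are silently clamped. For contrast, this short sketch shows that indexing a single element at the same out-of-range position raises `IndexError`:

```python
data = 'abcde'

# Slicing clamps out-of-range bounds to the sequence length
sliced = data[:100]
print(sliced)  # abcde

# Indexing a single element does not clamp and raises IndexError
try:
    data[100]
except IndexError:
    raised = True
else:
    raised = False

print(raised)  # True
```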
4.7.8. Slice str
>>> data = 'abcde'
>>>
>>>
>>> data[0:3]
'abc'
>>> data[3:5]
'de'
>>> data[:3]
'abc'
>>> data[3:]
'de'
>>> data[::1]
'abcde'
>>> data[::-1]
'edcba'
>>> data[::2]
'ace'
>>> data[::-2]
'eca'
>>> data[1::2]
'bd'
>>> data[1:4:2]
'bd'
4.7.9. Slice tuple
>>> data = ('a', 'b', 'c', 'd', 'e')
>>>
>>>
>>> data[0:3]
('a', 'b', 'c')
>>> data[3:5]
('d', 'e')
>>> data[:3]
('a', 'b', 'c')
>>> data[3:]
('d', 'e')
>>> data[::2]
('a', 'c', 'e')
>>> data[::-1]
('e', 'd', 'c', 'b', 'a')
>>> data[1::2]
('b', 'd')
>>> data[1:4:2]
('b', 'd')
4.7.10. Slice list
>>> data = ['a', 'b', 'c', 'd', 'e']
>>>
>>>
>>> data[0:3]
['a', 'b', 'c']
>>> data[3:5]
['d', 'e']
>>> data[:3]
['a', 'b', 'c']
>>> data[3:]
['d', 'e']
>>> data[::2]
['a', 'c', 'e']
>>> data[::-1]
['e', 'd', 'c', 'b', 'a']
>>> data[1::2]
['b', 'd']
>>> data[1:4:2]
['b', 'd']
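One detail the list examples do not show directly: slicing a list builds a new list object, so changing the result leaves the original untouched. A minimal sketch (the name `copy` is only illustrative):

```python
data = ['a', 'b', 'c', 'd', 'e']

# A full slice creates a new list with the same elements
copy = data[:]
copy.append('f')

print(data)          # ['a', 'b', 'c', 'd', 'e']
print(copy)          # ['a', 'b', 'c', 'd', 'e', 'f']
print(copy is data)  # False
```

The copy is shallow: nested objects inside the list are shared between both lists, not duplicated.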
4.7.11. Slice set
Slicing a set is not possible, because sets are unordered and do not support indexing:
>>> data = {'a', 'b', 'c', 'd', 'e'}
>>>
>>> data[:3]
Traceback (most recent call last):
TypeError: 'set' object is not subscriptable
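A possible workaround, assuming that ordering the elements by value is acceptable for the task at hand, is to convert the set to a sorted list first and slice that list:

```python
data = {'a', 'b', 'c', 'd', 'e'}

# sorted() returns a list, which has a defined order and supports slicing
ordered = sorted(data)
first_three = ordered[:3]
print(first_three)  # ['a', 'b', 'c']

# Convert back to a set if a set is needed as the final result
result = set(first_three)
print(result == {'a', 'b', 'c'})  # True
```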
4.7.12. Nested Sequences
>>> DATA = [
... ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
... (5.8, 2.7, 5.1, 1.9, 'virginica'),
... (5.1, 3.5, 1.4, 0.2, 'setosa'),
... (5.7, 2.8, 4.1, 1.3, 'versicolor'),
... (6.3, 2.9, 5.6, 1.8, 'virginica'),
... (6.4, 3.2, 4.5, 1.5, 'versicolor'),
... (4.7, 3.2, 1.3, 0.2, 'setosa')]
>>>
>>>
>>> DATA[1:]
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa')]
>>>
>>> DATA[-3:]
[(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa')]
4.7.13. Column Selection
Column selection unfortunately does not work on list:
>>> data = [[1, 2, 3],
... [4, 5, 6],
... [7, 8, 9]]
...
>>> data[:]
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>>
>>> data[:, 1]
Traceback (most recent call last):
TypeError: list indices must be integers or slices, not tuple
>>>
>>> data[:][1]
[4, 5, 6]
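However, the data[:, 1] syntax is valid for numpy arrays and, through .iloc, for pandas DataFrames. Note that data[:][1] returns the second row, not a column.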
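With plain nested lists, a column can still be collected by indexing or slicing each row in turn. The following is a minimal sketch using a list comprehension; the names `column` and `last_two` are illustrative:

```python
data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

# Take the element at index 1 from every row
column = [row[1] for row in data]
print(column)  # [2, 5, 8]

# Slice every row to keep only the last two columns
last_two = [row[1:] for row in data]
print(last_two)  # [[2, 3], [5, 6], [8, 9]]
```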
4.7.14. Index Arithmetic
>>> text = 'We choose to go to the Moon!'
>>> first = 23
>>> last = 28
>>> step = 2
>>>
>>> text[first:last]
'Moon!'
>>> text[first:last-1]
'Moon'
>>> text[first:last:step]
'Mo!'
>>> text[first:last-1:step]
'Mo'
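4.7.15. Slice Function
slice(start, stop, step) builds a slice object that can be used in place of start:stop:step
None stands for an omitted value, so slice(None, 9) means [:9]
Omitted values use the same defaults as in sequence[start:stop:step]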
>>> text = 'We choose to go to the Moon!'
>>>
>>> q = slice(23, 27)
>>> text[q]
'Moon'
>>>
>>> q = slice(None, 9)
>>> text[q]
'We choose'
>>>
>>> q = slice(23, None)
>>> text[q]
'Moon!'
>>>
>>> q = slice(23, None, 2)
>>> text[q]
'Mo!'
>>>
>>> q = slice(None, None, 2)
>>> text[q]
'W hoet ot h on'
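Because a slice is an ordinary object, it can be given a descriptive name and reused. The sketch below assumes the same `text` as above; the constant name `WORD` is invented for the example:

```python
text = 'We choose to go to the Moon!'

# A named slice documents what the numbers mean
WORD = slice(23, 27)
word = text[WORD]
print(word)  # Moon

# The bounds are available as attributes; an omitted step is None
bounds = (WORD.start, WORD.stop, WORD.step)
print(bounds)  # (23, 27, None)

# The same slice object works on any sequence, for example a list
letters = list(text)[WORD]
print(letters)  # ['M', 'o', 'o', 'n']
```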
4.7.16. Use Case - 0x01
>>> from pprint import pprint
>>>
>>>
>>> DATA = [
... ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
... (5.8, 2.7, 5.1, 1.9, 'virginica'),
... (5.1, 3.5, 1.4, 0.2, 'setosa'),
... (5.7, 2.8, 4.1, 1.3, 'versicolor'),
... (6.3, 2.9, 5.6, 1.8, 'virginica'),
... (6.4, 3.2, 4.5, 1.5, 'versicolor'),
... (4.7, 3.2, 1.3, 0.2, 'setosa')]
>>>
>>>
>>> pprint(DATA[1:])
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa')]
>>>
>>> pprint(DATA[1::2])
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.4, 3.2, 4.5, 1.5, 'versicolor')]
>>>
>>> pprint(DATA[1::-2])
[(5.8, 2.7, 5.1, 1.9, 'virginica')]
>>>
>>> pprint(DATA[:1:-2])
[(4.7, 3.2, 1.3, 0.2, 'setosa'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa')]
>>>
>>> pprint(DATA[:-5:-2])
[(4.7, 3.2, 1.3, 0.2, 'setosa'), (6.3, 2.9, 5.6, 1.8, 'virginica')]
>>>
>>> pprint(DATA[1:-5:-2])
[]
4.7.17. Use Case - 0x02
>>> data = [[1, 2, 3],
... [4, 5, 6],
... [7, 8, 9]]
...
>>> data[::2]
[[1, 2, 3], [7, 8, 9]]
>>>
>>> data[::2][1]
[7, 8, 9]
>>>
>>> data[::2][:1]
[[1, 2, 3]]
>>>
>>> data[::2][1][1:]
[8, 9]
4.7.18. Use Case - 0x03
>>> text = 'We choose to go to the Moon!'
>>> word = 'Moon'
>>>
>>>
>>> start = text.find(word)
>>> stop = start + len(word)
>>>
>>> text[start:stop]
'Moon'
>>>
>>> text[:start]
'We choose to go to the '
>>>
>>> text[stop:]
'!'
>>>
>>> text[:start] + text[stop:]
'We choose to go to the !'
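The use case above relies on the word being present. `str.find()` returns -1 when it is not, and -1 is itself a valid slice bound, so the result would be silently wrong. A hedged sketch of a guard, using a hypothetical helper named `remove_first`:

```python
def remove_first(text, word):
    """Hypothetical helper: remove the first occurrence of word using slicing."""
    start = text.find(word)
    if start == -1:
        # Word not found: return the text unchanged instead of slicing with -1
        return text
    stop = start + len(word)
    return text[:start] + text[stop:]


text = 'We choose to go to the Moon!'

found = remove_first(text, 'Moon')
print(found)  # We choose to go to the !

missing = remove_first(text, 'Mars')
print(missing)  # We choose to go to the Moon!
```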
4.7.19. Assignments
"""
* Assignment: Sequence Slice Text
* Required: yes
* Complexity: easy
* Lines of code: 8 lines
* Time: 8 min
English:
1. Remove the academic title and military rank from each variable
2. Also remove whitespace at the beginning and end of the text
3. Use only slicing to clean the text
4. Run doctests - all must succeed
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert a is not Ellipsis, \
'Assign your result to variable `a`'
>>> assert b is not Ellipsis, \
'Assign your result to variable `b`'
>>> assert c is not Ellipsis, \
'Assign your result to variable `c`'
>>> assert d is not Ellipsis, \
'Assign your result to variable `d`'
>>> assert e is not Ellipsis, \
'Assign your result to variable `e`'
>>> assert f is not Ellipsis, \
'Assign your result to variable `f`'
>>> assert g is not Ellipsis, \
'Assign your result to variable `g`'
>>> assert type(a) is str, \
'Variable `a` has invalid type, should be str'
>>> assert type(b) is str, \
'Variable `b` has invalid type, should be str'
>>> assert type(c) is str, \
'Variable `c` has invalid type, should be str'
>>> assert type(d) is str, \
'Variable `d` has invalid type, should be str'
>>> assert type(e) is str, \
'Variable `e` has invalid type, should be str'
>>> assert type(f) is str, \
'Variable `f` has invalid type, should be str'
>>> assert type(g) is str, \
'Variable `g` has invalid type, should be str'
>>> example
'Mark Watney'
>>> a
'Pan Twardowski'
>>> b
'Pan Twardowski'
>>> c
'Mark Watney'
>>> d
'Melissa Lewis'
>>> e
'Ryan Stone'
>>> f
'Ryan Stone'
>>> g
'Pan Twardowski'
"""
EXAMPLE = 'lt. Mark Watney, PhD'
A = 'dr hab. inż. Pan Twardowski, prof. AATC'
B = 'gen. pil. Pan Twardowski'
C = 'Mark Watney, PhD'
D = 'lt. col. ret. Melissa Lewis'
E = 'dr n. med. Ryan Stone'
F = 'Ryan Stone, MD-PhD'
G = 'lt. col. Pan Twardowski\t'
example = EXAMPLE[4:-5]
# String with: 'Pan Twardowski'
# type: str
a = ...
# String with: 'Pan Twardowski'
# type: str
b = ...
# String with: 'Mark Watney'
# type: str
c = ...
# String with: 'Melissa Lewis'
# type: str
d = ...
# String with: 'Ryan Stone'
# type: str
e = ...
# String with: 'Ryan Stone'
# type: str
f = ...
# String with: 'Pan Twardowski'
# type: str
g = ...
"""
* Assignment: Sequence Slice Substr
* Required: yes
* Complexity: easy
* Lines of code: 3 lines
* Time: 5 min
English:
1. Use `str.find()` and slicing
2. Define `result` with `TEXT` without the fragment from `REMOVE`
3. Result should be: 'We choose the Moon!'
4. Do not use `str.replace()`
5. Run doctests - all must succeed
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is str, \
'Variable `result` has invalid type, should be str'
>>> result
'We choose the Moon!'
"""
TEXT = 'We choose to go to the Moon!'
REMOVE = 'to go to '
# String with TEXT without REMOVE part
# type: str
result = ...
"""
* Assignment: Sequence Slice Sequence
* Required: yes
* Complexity: easy
* Lines of code: 2 lines
* Time: 3 min
English:
1. Create set `result` with every second element from `a` and `b`
2. Run doctests - all must succeed
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is set, \
'Variable `result` has invalid type, should be set'
>>> result
{0, 2, 4}
"""
a = (0, 1, 2, 3)
b = [2, 3, 4, 5]
# Set with every second element from `a` and `b`
# type: set[int]
result = ...
"""
* Assignment: Sequence Slice Header/Rows
* Required: yes
* Complexity: easy
* Lines of code: 2 lines
* Time: 3 min
English:
1. Separate header (first line) from rows:
a. Define `header: tuple[str]` with header
b. Define `rows: list[tuple]` with other rows
2. Run doctests - all must succeed
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert header is not Ellipsis, \
'Assign your result to variable `header`'
>>> assert rows is not Ellipsis, \
'Assign your result to variable `rows`'
>>> assert type(header) is tuple, \
'Variable `header` has invalid type, should be tuple'
>>> assert all(type(x) is tuple for x in rows), \
'All elements in `rows` should be tuple'
>>> assert header not in rows, \
'Header should not be in `rows`'
>>> header
('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species')
>>> rows # doctest: +NORMALIZE_WHITESPACE
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa'),
(7.0, 3.2, 4.7, 1.4, 'versicolor'),
(7.6, 3.0, 6.6, 2.1, 'virginica'),
(4.9, 3.0, 1.4, 0.2, 'setosa'),
(4.9, 2.5, 4.5, 1.7, 'virginica')]
"""
DATA = [
('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa'),
(7.0, 3.2, 4.7, 1.4, 'versicolor'),
(7.6, 3.0, 6.6, 2.1, 'virginica'),
(4.9, 3.0, 1.4, 0.2, 'setosa'),
(4.9, 2.5, 4.5, 1.7, 'virginica')]
# Tuple with row at index 0 from DATA
# type: tuple[str]
header = ...
# List with rows at all the other indexes from DATA
# type: list[tuple]
rows = ...
"""
* Assignment: Sequence Slice Train/Test
* Required: yes
* Complexity: easy
* Lines of code: 4 lines
* Time: 8 min
English:
1. Divide `rows` into two lists:
a. `train`: 60% - training data
b. `test`: 40% - testing data
2. Calculate split point:
a. `rows` length multiplied by percent
b. From `rows` slice training data from start to split
c. From `rows` slice test data from split to end
3. Run doctests - all must succeed
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert split is not Ellipsis, \
'Assign your result to variable `split`'
>>> assert train is not Ellipsis, \
'Assign your result to variable `train`'
>>> assert test is not Ellipsis, \
'Assign your result to variable `test`'
>>> assert type(split) is int, \
'Variable `split` has invalid type, should be int'
>>> assert type(train) is list, \
'Variable `train` has invalid type, should be list'
>>> assert type(test) is list, \
'Variable `test` has invalid type, should be list'
>>> assert all(type(x) is tuple for x in train), \
'All elements in `train` should be tuple'
>>> assert all(type(x) is tuple for x in test), \
'All elements in `test` should be tuple'
>>> split
6
>>> train # doctest: +NORMALIZE_WHITESPACE
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa')]
>>> test # doctest: +NORMALIZE_WHITESPACE
[(7.0, 3.2, 4.7, 1.4, 'versicolor'),
(7.6, 3.0, 6.6, 2.1, 'virginica'),
(4.9, 3.0, 1.4, 0.2, 'setosa'),
(4.9, 2.5, 4.5, 1.7, 'virginica')]
"""
DATA = [
('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa'),
(7.0, 3.2, 4.7, 1.4, 'versicolor'),
(7.6, 3.0, 6.6, 2.1, 'virginica'),
(4.9, 3.0, 1.4, 0.2, 'setosa'),
(4.9, 2.5, 4.5, 1.7, 'virginica')]
header = DATA[0]
rows = DATA[1:]
# Result of `rows` length multiplied by percent
# type: int
split = ...
# List with first 60% from rows
# type: list[tuple]
train = ...
# List with last 40% from rows
# type: list[tuple]
test = ...