2.8. Pandas Read Python

  • pd.DataFrame()

2.8.1. Example

2.8.2. Assignments

Code 2.35. Solution
"""
* Assignment: Pandas Read PythonDict
* Complexity: medium
* Lines of code: 8 lines
* Time: 13 min

English:
    1. Convert `DATA` to format with one column per each attrbute for example:
       a. `mission1_year`, `mission2_year`,
       b. `mission1_name`, `mission2_name`
    2. Note, that enumeration starts with one
    3. Convert data to `result: pd.DataFrame`
    4. Run doctests - all must succeed

Polish:
    1. Przekonweruj `DATA` do formatu z jedną kolumną dla każdego atrybutu, np:
       a. `mission1_year`, `mission2_year`,
       b. `mission1_name`, `mission2_name`
    2. Zwróć uwagę, że enumeracja zaczyna się od jeden
    3. Przekonwertuj dane do `result: pd.DataFrame`
    4. Uruchom doctesty - wszystkie muszą się powieść

Hint:
    * dict.pop()
    * enumerate(..., start=1)
    * column_name = f'mission{i}_{field}'

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is pd.DataFrame, \
    'Variable `result` has invalid type, should be `pd.DataFrame`'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
      firstname  lastname mission1_year mission1_name mission2_year mission2_name
    0      Mark    Watney          2035         Ares3           NaN           NaN
    1   Melissa     Lewis          2030         Ares1          2035         Ares3
    2      Rick  Martinez           NaN           NaN           NaN           NaN
"""
import pandas as pd


DATA = [
    {"firstname": "Mark", "lastname": "Watney", "missions": [
        {"year": "2035", "name": "Ares3"}]},

    {"firstname": "Melissa", "lastname": "Lewis", "missions": [
         {"year": "2030", "name": "Ares1"},
         {"year": "2035", "name": "Ares3"}]},

    {"firstname": "Rick", "lastname": "Martinez", "missions": []}
]


# Define variable data with flatten ``DATA``
# Each mission field prefixed with mission and number
# type: list[dict]
data = ...

# Convert `data` to DataFrame
# type: pd.DataFrame
result = ...


Code 2.36. Solution
"""
* Assignment: Pandas Read PythonObj
* Complexity: medium
* Lines of code: 7 lines
* Time: 13 min

English:
    1. Read data from `DATA` as `result: pd.DataFrame`
    2. Non-functional requirements:
        a. Use `,` to separate mission fields
        b. Use `;` to separate missions
    2. Run doctests - all must succeed

Polish:
    1. Wczytaj dane z DATA jako result: pd.DataFrame
    2. Wymagania niefunkcjonalne:
        a. Użyj `,` do oddzielania pól mission
        b. Użyj `;` do oddzielenia missions
    2. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `vars(obj)`
    * dict.pop()
    * Nested `for`
    * `str.join(';', sequence)`
    * `str.join(',', sequence)`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> assert type(result) is pd.DataFrame, \
    'Variable `result` has invalid type, should be `pd.DataFrame`'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
      firstname  lastname                 missions
    0      Mark    Watney              2035,Ares 3
    1   Melissa     Lewis  2030,Ares 1;2035,Ares 3
    2      Rick  Martinez
"""

import pandas as pd


class Astronaut:
    def __init__(self, firstname, lastname, missions=None):
        self.firstname = firstname
        self.lastname = lastname
        self.missions = list(missions) if missions else []


class Mission:
    def __init__(self, year, name):
        self.year = year
        self.name = name


DATA = [
    Astronaut('Mark', 'Watney', missions=[
        Mission(2035, 'Ares 3')]),

    Astronaut('Melissa', 'Lewis', missions=[
        Mission(2030, 'Ares 1'),
        Mission(2035, 'Ares 3')]),

    Astronaut('Rick', 'Martinez', missions=[]),
]


# Convert DATA to list[dict], then flatten
# type: list[dict]
data = ...

# Convert `data` to DataFrame
# type: pd.DataFrame
result = ...