Data Types: A brief overview

1.2. Data Types: A brief overview

Data types are fundamental classifications that describe the nature of values in a programming language. They specify the kind of data a value can hold and the operations that can be performed on it. Python is a dynamically typed language: a variable's type is not declared in advance but is determined at runtime by the value it currently refers to. Here are some common data types in Python:

Numeric Types:
- int: Represents integer values, such as -3, 0, or 42. These are whole numbers without a fractional part.
- float: Represents floating-point numbers with a decimal point or in scientific notation, like 3.14 or 2.5e-3 (which means \(2.5 \times 10^{-3}\)).
String Type (str):
- Represents sequences of characters enclosed in single (' ') or double (" ") quotes.
- Example: "Hello, World!". Strings support various operations like concatenation, slicing, and formatting.
Boolean Type (bool):
- Represents the values True or False, which are used in logical expressions and conditions.
- Example: True or False. Boolean operations include and, or, and not.
List Type (list):
- Represents ordered collections of items, which can be of different data types.
- Lists are mutable, allowing addition, removal, or modification of elements.
- Example: [1, 2, "apple", True]. Access elements by index, e.g., my_list[0].
Tuple Type (tuple):
- Represents ordered collections of items like lists, but tuples are immutable (cannot be changed after creation).
- Tuples are often used to ensure data integrity.
- Example: (3, 7, "banana"). Access elements by index, similar to lists.
Dictionary Type (dict):
- Represents key-value pairs, where each value is associated with a unique key.
- Dictionaries are useful for quick data lookups. Keys must be hashable, which in practice means immutable values such as strings, numbers, or tuples that contain only immutable items.
- Example: {"name": "Alice", "age": 30, "city": "Wonderland"}. Access values using keys, e.g., my_dict["age"].
Set Type (set):
- Represents unordered collections of unique elements, useful for tasks like removing duplicates.
- Sets do not allow duplicate values and support set operations like union, intersection, and difference.
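- Example: {1, 2, 3, 3, 4} evaluates to {1, 2, 3, 4} because the duplicate 3 is dropped. Create a set using curly braces, e.g., my_set = {1, 2, 3}. Note that {} creates an empty dictionary, so use set() for an empty set.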
None Type (NoneType):
- Represents the absence of a value or a null value.
- Often used to indicate the absence of a meaningful result or to initialize variables that will be assigned a value later.
- Example: None.
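As a quick illustration of the types listed above, the following sketch inspects one sample value per type with the built-in type() function and contrasts list mutability with tuple immutability. The variable names are chosen here purely for illustration.

```python
# One sample value for each built-in type described above.
samples = [42, 3.14, "Hello, World!", True, [1, 2, "apple"],
           (3, 7, "banana"), {"name": "Alice"}, {1, 2, 3}, None]

# type(value).__name__ gives the name of the value's type as a string.
type_names = [type(value).__name__ for value in samples]
print(type_names)

# Lists are mutable: an element can be replaced in place.
my_list = [1, 2, "apple", True]
my_list[0] = 99

# Tuples are immutable: the same assignment raises TypeError.
my_tuple = (3, 7, "banana")
try:
    my_tuple[0] = 99
except TypeError as exc:
    tuple_error = type(exc).__name__
    print("Cannot modify a tuple:", tuple_error)
```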

Note

In Chapter Data Structures and File Handling in Python, we will discuss various data structures.

Table 1.1 lists some applications of each of these eight data types in Python programming tasks [Downey, 2015; Python Software Foundation, 2024].

Table 1.1 Applications of Python Data Types in Programming Tasks
Data Type	Applications
Numeric Types	Arithmetic operations, calculations, storing/manipulating numerical data, handling quantities, scientific/engineering.
String Type	Working with textual data, input/output, parsing, formatting, manipulating strings, user interfaces, generating reports.
Boolean Type	Implementing conditionals, decision-making, logical expressions, comparisons, handling binary states.
List Type	Storing ordered collections, dynamic data structures, stacks/queues, managing/organizing data.
Tuple Type	Ensuring data integrity, returning multiple values from functions, representing structured data.
Dictionary Type	Efficient data retrieval/manipulation with key-value pairs, data caches, configuration management, indexing.
Set Type	Removing duplicates, membership checks, set operations, uniqueness in algorithms.
None Type	Representing absence of value, indicating lack of meaningful result, initializing variables.
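A few of the applications in Table 1.1 can be sketched in a handful of lines. The helper min_max and the sample data below are hypothetical and serve only to show a tuple returned from a function, a set used for duplicate removal and membership checks, and a dictionary used for configuration lookup.

```python
def min_max(values):
    """Return the smallest and largest value as a tuple (hypothetical helper)."""
    return min(values), max(values)

readings = [4, 8, 15, 16, 23, 42, 8, 15]

# Tuple: return multiple values from a function and unpack them.
lowest, highest = min_max(readings)

# Set: remove duplicates and test membership.
unique_readings = set(readings)
has_42 = 42 in unique_readings

# Dictionary: look up a configuration value by key.
config = {"units": "cm", "precision": 2}
units = config["units"]

print(lowest, highest, len(unique_readings), has_42, units)
```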

1.2.1. Data Type Conversion Functions

Python provides several built-in functions and methods for converting between data types. Some commonly used ones are [Python Software Foundation, 2024]:

int(x, base=10)
- Converts x to an integer. The base parameter specifies the base in which to interpret the input (default is 10); it may only be given when x is a string.
- Example: int('42') returns 42.
float(x)
- Converts x to a floating-point number.
- Example: float('3.14') returns 3.14.
str(x)
- Converts object x to a string representation.
- Example: str(42) returns '42'.
chr(i)
- Returns a string representing a character whose Unicode code point is the integer i.
- Example: chr(65) returns 'A'.
ord(c)
- Returns the integer Unicode code point of the single character c.
- Example: ord('A') returns 65.
bool(x)
- Converts x to a Boolean value (True or False).
- Example: bool(0) returns False.
list(iterable)
- Converts an iterable (e.g., a tuple, string, or set) to a list.
- Example: list((1, 2, 3)) returns [1, 2, 3].
tuple(iterable)
- Converts an iterable to a tuple.
- Example: tuple([1, 2, 3]) returns (1, 2, 3).
set(iterable)
- Converts an iterable to a set.
- Example: set([1, 2, 3, 3]) returns {1, 2, 3}.
dict()
- Creates a new dictionary, optionally populated from keyword arguments, another mapping, or an iterable of key-value pairs.
- Example: dict(a=1, b=2) returns {'a': 1, 'b': 2}.
str.encode(encoding='UTF-8', errors='strict')
- Encodes the string using the specified encoding.
- Example: 'hello'.encode('UTF-16') returns b'\xff\xfeh\x00e\x00l\x00l\x00o\x00' on a little-endian platform; the leading \xff\xfe is the byte order mark.
bytes.decode(encoding='UTF-8', errors='strict')
- Decodes a bytes object to a string using the specified encoding.
- Example: b'\xff\xfeh\x00e\x00l\x00l\x00o\x00'.decode('UTF-16') returns 'hello'.

These functions are essential for handling different data types and performing type conversions in Python.
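The base parameter of int() and the error raised on invalid input are easy to overlook, so here is a minimal sketch of both. The helper to_int_or_none is a hypothetical name introduced for this example and is not part of Python.

```python
# base tells int() how to interpret a string of digits.
hex_value = int("ff", base=16)      # hexadecimal
binary_value = int("101", base=2)   # binary
prefixed_value = int("0x1f", base=0)  # base=0 infers the base from the prefix

def to_int_or_none(text):
    """Return int(text), or None if text is not a valid integer literal."""
    try:
        return int(text)
    except ValueError:
        return None

print(hex_value, binary_value, prefixed_value)
print(to_int_or_none(" 42 "), to_int_or_none("forty-two"))
```

Surrounding whitespace is accepted by int(), which is why " 42 " converts successfully while "forty-two" does not.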

Table 1.2 Common Python Conversion Functions and Their Usage
Conversion Function	Description	Example Usage
int(x, base=10)	Converts x to an integer.	`int('42')` returns `42`.
float(x)	Converts x to a floating-point number.	`float('3.14')` returns `3.14`.
str(x)	Converts object x to a string representation.	`str(42)` returns `'42'`.
chr(i)	Returns a string representing a Unicode character.	`chr(65)` returns `'A'`.
ord(c)	Returns the Unicode code point of a character.	`ord('A')` returns `65`.
bool(x)	Converts x to a Boolean value (True or False).	`bool(0)` returns `False`.
list(iterable)	Converts an iterable to a list.	`list((1, 2, 3))` returns `[1, 2, 3]`.
tuple(iterable)	Converts an iterable to a tuple.	`tuple([1, 2, 3])` returns `(1, 2, 3)`.
set(iterable)	Converts an iterable to a set.	`set([1, 2, 3, 3])` returns `{1, 2, 3}`.
dict()	Creates a new dictionary.	`dict(a=1, b=2)` returns `{'a': 1, 'b': 2}`.
str.encode(encoding='UTF-8', errors='strict')	Encodes a string using the specified encoding.	`'hello'.encode('UTF-16')` returns `b'\xff\xfeh\x00e\x00l\x00l\x00o\x00'`.
bytes.decode(encoding='UTF-8', errors='strict')	Decodes a bytes object to a string using the specified encoding.	`b'\xff\xfeh\x00e\x00l\x00l\x00o\x00'.decode('UTF-16')` returns `'hello'`.

Note - Encoding and Decoding in Python

In Python, the methods str.encode() and bytes.decode() are used to convert between string objects (str) and bytes objects (bytes). These conversions are essential when dealing with different character encodings, especially in contexts like file I/O and network communication.

Encoding: Encoding is the process of converting a string (textual data) into bytes (binary data). This is necessary when storing or transmitting text in a format that requires binary representation.
```
# Example of encoding a string to bytes
encoded_string = 'hello'.encode('UTF-16')
print(encoded_string)  # Output: b'\xff\xfeh\x00e\x00l\x00l\x00o\x00'
```
- str.encode(encoding='UTF-8', errors='strict'):
  - encoding: Specifies the character encoding to use (default is 'UTF-8').
  - errors: Specifies how to handle characters that cannot be encoded (default is 'strict', which raises an error). Other options include 'ignore', 'replace', and more.
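To make the errors parameter concrete, the sketch below encodes a string containing a non-ASCII character with three different error handlers. The sample string is an assumption chosen for illustration.

```python
text = "caf\u00e9"  # "café"; the last character is not representable in ASCII

# UTF-8 can represent every character, so no error handler is needed.
utf8_bytes = text.encode("utf-8")

# 'replace' substitutes a question mark for each unencodable character.
replaced = text.encode("ascii", errors="replace")

# 'ignore' silently drops each unencodable character.
ignored = text.encode("ascii", errors="ignore")

# 'strict' (the default) raises UnicodeEncodeError instead.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(utf8_bytes, replaced, ignored, strict_failed)
```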
Decoding: Decoding is the process of converting bytes (binary data) back into a string (textual data). This is necessary when reading or receiving binary data that needs to be interpreted as text.
```
# Example of decoding bytes to a string
decoded_bytes = b'\xff\xfeh\x00e\x00l\x00l\x00o\x00'.decode('UTF-16')
print(decoded_bytes)  # Output: hello
```
- bytes.decode(encoding='UTF-8', errors='strict'):
  - encoding: Specifies the character encoding used during the decoding process (default is 'UTF-8').
  - errors: Specifies how to handle decoding errors (default is 'strict', which raises an error). Other options include 'ignore', 'replace', etc.
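Decoding depends on choosing the same encoding that produced the bytes. The following sketch, using sample bytes assumed for illustration, shows a correct decode, two error handlers, and the garbled text that results from decoding with the wrong encoding.

```python
data = b"caf\xc3\xa9"  # the UTF-8 encoding of "café"

# Decoding with the matching encoding recovers the original text.
correct = data.decode("utf-8")

# ASCII cannot interpret bytes above 0x7f; 'replace' inserts U+FFFD for each.
replaced = data.decode("ascii", errors="replace")

# 'ignore' drops the undecodable bytes.
ignored = data.decode("ascii", errors="ignore")

# Latin-1 accepts every byte, so the wrong encoding yields garbled text, not an error.
garbled = data.decode("latin-1")

print(correct, replaced, ignored, garbled)
```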
Key Points
- Textual Data: Typically stored and manipulated as strings (str).
- Binary Data: Represented as bytes (bytes), essential for file I/O, network communication, etc.
- Encoding and Decoding: Complementary processes. Decoding bytes with the same encoding that produced them recovers the original string, provided no characters were dropped or replaced by an error handler.

Understanding and correctly using str.encode() and bytes.decode() ensures proper handling of text and binary data, especially when dealing with different languages and character sets.
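The round-trip property and the role of the byte order mark in UTF-16 can be checked directly. This is a small sketch; the list of encodings tested is an arbitrary choice for illustration.

```python
text = "hello"

# Encoding and then decoding with the same codec returns the original string.
round_trips = {
    name: text.encode(name).decode(name) == text
    for name in ("utf-8", "utf-16", "utf-32")
}

# Naming the byte order explicitly omits the byte order mark (BOM).
little_endian = text.encode("utf-16-le")
big_endian = text.encode("utf-16-be")

# Plain 'utf-16' prepends a 2-byte BOM to the 10 bytes of character data.
with_bom_length = len(text.encode("utf-16"))

print(round_trips)
print(little_endian, big_endian, with_bom_length)
```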

Example - Conversion Functions:

Integer Conversion: int(x, base=10) converts a string or a float to an integer. A float is truncated toward zero rather than rounded. The base parameter specifies the base of the input string (default is 10) and may only be given when x is a string.

int_from_string = int('42')  # Converts the string '42' to the integer 42
int_from_float = int(132.0)  # Converts the float 132.0 to the integer 132
print("int_from_string:", int_from_string)
print("int_from_float:", int_from_float)

int_from_string: 42
int_from_float: 132
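Two details of int() are worth seeing in action: floats are truncated rather than rounded, and a string that looks like a float is rejected. The sketch below uses values chosen for illustration.

```python
# int() discards the fractional part, moving toward zero.
truncated_positive = int(3.9)
truncated_negative = int(-3.9)

# round() rounds to the nearest integer instead.
rounded = round(3.9)

# A string with a decimal point is not a valid integer literal.
try:
    int("3.14")
    string_failed = False
except ValueError:
    string_failed = True

# Converting through float first works, and then truncates.
via_float = int(float("3.14"))

print(truncated_positive, truncated_negative, rounded, string_failed, via_float)
```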

Float Conversion: float(x) converts a string or an integer to a floating-point number.

float_from_string = float('3.14')  # Converts the string '3.14' to the float 3.14
float_from_integer = float(132)    # Converts the integer 132 to the float 132.0
print("float_from_string:", float_from_string)
print("float_from_integer:", float_from_integer)

float_from_string: 3.14
float_from_integer: 132.0
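float() also accepts scientific notation and surrounding whitespace, and the resulting values are binary approximations. A brief sketch, with inputs assumed for illustration:

```python
import math

from_scientific = float("1e3")   # scientific notation
from_padded = float("  2.5  ")   # surrounding whitespace is ignored

# Floats are binary approximations, so exact equality can be surprising.
exactly_equal = (0.1 + 0.2 == 0.3)
approximately_equal = math.isclose(0.1 + 0.2, 0.3)

# Text that is not a number raises ValueError.
try:
    float("abc")
    invalid_failed = False
except ValueError:
    invalid_failed = True

print(from_scientific, from_padded, exactly_equal, approximately_equal, invalid_failed)
```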

String Conversion: str(x) converts an integer or another data type to its string representation.

integer_to_string = str(42)  # Converts the integer 42 to the string '42'
print("integer_to_string:", integer_to_string)

integer_to_string: 42
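str() works on any object, and it is typically needed when joining text with non-string values. The example values below are assumptions for illustration.

```python
# str() produces a readable text form of many kinds of objects.
as_strings = [str(3.14), str(True), str(None), str([1, 2])]

# Concatenating a string with an integer directly raises TypeError.
try:
    label = "Age: " + 30
except TypeError:
    label = "Age: " + str(30)   # convert explicitly first

# An f-string performs the same conversion implicitly.
formatted = f"Age: {30}"

print(as_strings, label, formatted)
```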

Unicode Character Conversion: chr(i) returns a string representing a character whose Unicode code point is the integer i.

unicode_character = chr(65)  # Converts the integer 65 to the character 'A'
print("unicode_character:", unicode_character)

unicode_character: A

Unicode Code Point Conversion: ord(c) returns an integer representing the Unicode code point of the character c.

unicode_code_point = ord('A')  # Converts the character 'A' to the Unicode code point 65
print("unicode_code_point:", unicode_code_point)

unicode_code_point: 65
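Because chr() and ord() are inverses of each other, they can be combined to do arithmetic on characters. A minimal sketch with illustrative values:

```python
# Round trip: ord() then chr() returns the original character.
round_trip = chr(ord("A"))

# Code point arithmetic: the letter after 'A'.
next_letter = chr(ord("A") + 1)

# Works beyond ASCII as well: the euro sign is U+20AC.
euro_code_point = ord("\u20ac")

# Code points for each character of a string.
code_points = [ord(character) for character in "abc"]

# ord() requires exactly one character.
try:
    ord("AB")
    multi_failed = False
except TypeError:
    multi_failed = True

print(round_trip, next_letter, euro_code_point, code_points, multi_failed)
```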

Boolean Conversion: bool(x) converts a value to a Boolean. Zero of any numeric type, empty strings and containers (lists, tuples, dictionaries, sets), None, and False convert to False; most other values convert to True.

boolean_from_integer = bool(0)  # Converts the integer 0 to the Boolean False
print("boolean_from_integer:", boolean_from_integer)

boolean_from_integer: False
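The truthiness rules can be verified over a range of values. Note in particular that any non-empty string is True, including the string "False". The value lists below are chosen for illustration.

```python
falsy_values = [0, 0.0, "", [], (), {}, set(), None, False]
truthy_values = ["False", "0", " ", [0], (None,), -1, 0.1]

# Every value in the first list converts to False.
all_falsy = all(bool(value) is False for value in falsy_values)

# Every value in the second list converts to True.
all_truthy = all(bool(value) is True for value in truthy_values)

print(all_falsy, all_truthy)
```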

List Conversion: list(iterable) converts an iterable (e.g., a tuple) to a list.

list_from_tuple = list((1, 2, 3))  # Converts the tuple (1, 2, 3) to the list [1, 2, 3]
print("list_from_tuple:", list_from_tuple)

list_from_tuple: [1, 2, 3]
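list() accepts any iterable, not only tuples. The sketch below shows a few other common sources; the sample values are assumptions.

```python
# A string is an iterable of its characters.
from_string = list("abc")

# range() produces numbers lazily; list() materializes them.
from_range = list(range(3))

# Iterating a dictionary yields its keys, in insertion order.
scores = {"a": 1, "b": 2}
from_dict_keys = list(scores)
from_dict_items = list(scores.items())

print(from_string, from_range, from_dict_keys, from_dict_items)
```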

Tuple Conversion: tuple(iterable) converts an iterable (e.g., a list) to a tuple.

tuple_from_list = tuple([1, 2, 3])  # Converts the list [1, 2, 3] to the tuple (1, 2, 3)
print("tuple_from_list:", tuple_from_list)

tuple_from_list: (1, 2, 3)
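One practical reason to convert a list to a tuple is to use it as a dictionary key, since keys must be hashable. A small sketch with hypothetical coordinate data:

```python
coordinates = [1, 2]

# A list cannot be a dictionary key because it is mutable (unhashable).
try:
    labels = {coordinates: "A"}
    list_key_failed = False
except TypeError:
    list_key_failed = True

# Converting the list to a tuple makes it usable as a key.
labels = {tuple(coordinates): "A"}
lookup = labels[(1, 2)]

# tuple() also accepts other iterables, such as strings.
from_string = tuple("abc")

print(list_key_failed, lookup, from_string)
```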

Set Conversion: set(iterable) converts an iterable to a set, which is an unordered collection of unique elements.

set_from_list = set([1, 2, 3, 3])  # Converts the list [1, 2, 3, 3] to the set {1, 2, 3}
print("set_from_list:", set_from_list)

set_from_list: {1, 2, 3}
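Since a set is unordered, converting back to a list does not guarantee the original order. The sketch below shows two common ways to get a predictable result; dict.fromkeys is used here as an order-preserving alternative, and the sample data is an assumption.

```python
data = [3, 1, 3, 2, 1]

# Remove duplicates and sort the result.
unique_sorted = sorted(set(data))

# Remove duplicates while keeping first-seen order (dict keys keep insertion order).
unique_in_order = list(dict.fromkeys(data))

# A string converts to the set of its distinct characters.
distinct_letters = set("hello")

print(unique_sorted, unique_in_order, len(distinct_letters))
```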

Dictionary Creation: dict() creates a new dictionary with specified key-value pairs.

new_dictionary = dict(a=1, b=2)  # Creates a dictionary with keys 'a' and 'b' and values 1 and 2
print("new_dictionary:", new_dictionary)

new_dictionary: {'a': 1, 'b': 2}
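Besides keyword arguments, dict() can build a dictionary from an iterable of key-value pairs or from another mapping. A short sketch with illustrative names:

```python
# From a list of (key, value) pairs.
from_pairs = dict([("a", 1), ("b", 2)])

# From two parallel sequences, paired up with zip().
keys = ["a", "b"]
values = [1, 2]
from_zip = dict(zip(keys, values))

# From an existing mapping plus extra keyword arguments (creates a new dict).
extended = dict(from_pairs, c=3)

print(from_pairs, from_zip, extended)
```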

String Encoding: str.encode(encoding='UTF-8', errors='strict') encodes a string into bytes using the specified encoding.

encoded_string = 'hello'.encode('UTF-16')  # Encodes the string 'hello' to UTF-16 bytes
print("encoded_string:", encoded_string)

encoded_string: b'\xff\xfeh\x00e\x00l\x00l\x00o\x00'

Bytes Decoding: bytes.decode(encoding='UTF-8', errors='strict') decodes bytes to a string using the specified encoding.

decoded_bytes = b'\xff\xfeh\x00e\x00l\x00l\x00o\x00'.decode('UTF-16')  # Decodes UTF-16 bytes to the string 'hello'
print("decoded_bytes:", decoded_bytes)

decoded_bytes: hello
