Day 03: Strings — Manodemy

🎯 Enterprise Objective

Strings are the primary vehicle for text data — the most common data type in real-world datasets. Today we master every aspect of string manipulation, from basic creation to performance optimization. You will learn not just string methods, but professional text processing patterns used in ETL pipelines and data cleaning.

📋 Strategic Overview

#	Topic	Key Methods	Core Use Case
1	Creation & Escaping	`''`, `""`, `r''`, `'''`	File paths, SQL, multi-line
2	Indexing & Slicing	`s[0]`, `s[1:4]`, `s[::-1]`	Data extraction
3	Core Methods	`.strip()`, `.replace()`, `.find()`	Data cleaning
4	Split & Join	`.split()`, `','.join()`	CSV/text parsing
5	Formatting	f-strings, `.format()`	Reports, dashboards
6	Validation	`.isdigit()`, `.isalpha()`	Input validation
7	Encoding	`.encode()`, `.decode()`	APIs, file I/O
8	Performance	`join()` vs `+=`	Efficient processing

1. String Creation & Escaping : Text Data Fundamentals

🔍 What is it?

A string is an immutable sequence of Unicode characters. You can create strings with single quotes '...', double quotes "...", or triple quotes '''...''' for multi-line text. Raw strings r'...' disable escape character processing.

Syntax	Use Case	Notes
`'hello'`	Simple text	Most common
`"it's"`	Text with apostrophes	Avoids escaping
`"""..."""`	Multi-line / docstrings	SQL queries, docs
`r'C:\path'`	Raw string (no escapes)	File paths, regex

Common Escape Characters:

Escape	Meaning	Example
`\n`	Newline	`'Line1\nLine2'`
`\t`	Tab	`'Col1\tCol2'`
`\\`	Backslash	`'C:\\Users'`
`\'`	Single quote	`'it\'s'`

💼 Why Data Analysts Care

• CSV/JSON parsing: Understanding escape characters is essential for parsing data files

• SQL queries: Triple-quoted strings hold multi-line SQL cleanly

• File paths: Raw strings r'C:\data\file.csv' prevent escape issues on Windows
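The four literal forms and the escape sequences from the tables above can be sketched as follows. The variable names and the file path are made up for illustration.

```python
# Four ways to write string literals
single = 'hello'
double = "it's"                   # no escaping needed for the apostrophe
multi = """SELECT name, age
FROM users
WHERE age > 18"""                 # line breaks are kept as typed
raw = r'C:\data\new_file.csv'     # backslashes stay literal, \n is not a newline

# The same path written with escaped backslashes
escaped = 'C:\\data\\new_file.csv'

print(raw == escaped)       # True
print(len('a\tb\nc'))       # 5: \t and \n each count as one character
print(repr('a\tb\nc'))      # repr() shows the escape sequences
print(multi.count('\n'))    # 2
```

One limitation worth knowing: a raw string literal cannot end with a single backslash, so a trailing path separator has to be added another way.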

⚠️ Immutability

Strings cannot be modified in place. Every operation like .upper() or + creates a new string object. This matters for performance in loops.
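A small sketch of what immutability means in practice, using a throwaway variable `s`:

```python
s = 'hello'

# Item assignment is not allowed on a str
error = ''
try:
    s[0] = 'H'
except TypeError as exc:
    error = str(exc)
    print('TypeError:', error)

# Methods return a new object and leave the original untouched
upper = s.upper()
print(s)        # hello
print(upper)    # HELLO

# "Changing" a string really means binding the name to a new string
s = 'H' + s[1:]
print(s)        # Hello
```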

🧠 Pro Tip

Use triple-quoted strings for multi-line SQL queries: query = """SELECT * FROM users WHERE age > 18""". This keeps your code readable.

In [ ]:

🧪 Concept Checks: `Creation`

Q1. Create strings using all 4 methods: single quotes, double quotes, triple quotes, and raw string. Print each with its type() and len().

In [ ]:

Q2. Write a string containing: a newline, a tab, and a backslash. Print it and then print its repr() to see the escape characters.

In [ ]:

Q3. Prove string immutability: create s = "hello", save id(s), then do s += " world". Compare the IDs. What does this prove?

In [ ]:

Q4. Create a raw string for the Windows path C:\Users\Admin\Documents\data.csv. Then create the same path using escape characters. Verify they are equal.

In [ ]:

Q5. Write a multi-line SQL query using triple quotes: SELECT name, age FROM users WHERE age > 18 ORDER BY name. Print it.

In [ ]:

2. Indexing & Slicing : Precision Text Extraction

🔍 What is it?

Strings support zero-based indexing and negative indexing (from the end). Slicing extracts substrings using the syntax s[start:stop:step] where stop is exclusive.

text = 'PYTHON'
#       P  Y  T  H  O  N
#       0  1  2  3  4  5   (positive index)
#      -6 -5 -4 -3 -2 -1  (negative index)

Operation	Syntax	Result
First char	`s[0]`	`'P'`
Last char	`s[-1]`	`'N'`
Slice	`s[1:4]`	`'YTH'`
Reverse	`s[::-1]`	`'NOHTYP'`
Every 2nd	`s[::2]`	`'PTO'`

💼 Why Data Analysts Care

• Column extraction: Extract substrings from fixed-width data: record[0:10] for name field

• Data cleaning: phone[-10:] to extract the last 10 digits of a phone number (after removing separators such as hyphens)

• Log parsing: Slice timestamps from log lines: log_line[:19] for ISO datetime

⚠️ Off-by-One

The stop index is exclusive: 'PYTHON'[0:3] gives 'PYT' (indices 0, 1, 2), not 'PYTH'. This is a common source of bugs.

🧠 Pro Tip

Reverse a string with s[::-1]. Check for palindromes: s == s[::-1].
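The indexing and slicing rules above, as a runnable sketch. The log line and the helper name `is_palindrome` are hypothetical examples, not part of the course material.

```python
text = 'PYTHON'

print(text[0], text[-1])    # P N
print(text[1:4])            # YTH  (indices 1, 2, 3; index 4 is excluded)
print(text[:3], text[3:])   # PYT HON  (the two halves rejoin without overlap)
print(text[::2])            # PTO
print(text[::-1])           # NOHTYP

# Out-of-range slices are clipped instead of raising an error
print(text[2:100])          # THON

# Fixed-position extraction from a log line (hypothetical sample)
log_line = '2024-01-15T14:30:22 ERROR Database connection failed'
timestamp = log_line[:19]
rest = log_line[20:]
print(timestamp)            # 2024-01-15T14:30:22
print(rest)                 # ERROR Database connection failed

def is_palindrome(word):
    """Return True if word reads the same in both directions."""
    return word == word[::-1]

print(is_palindrome('level'))   # True
print(is_palindrome('python'))  # False
```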

In [ ]:

🧪 Concept Checks: `Indexing & Slicing`

Q1. Given s = "Data Analytics", extract: first word, last word, every other character, and the reversed string. Print each.

In [ ]:

Q2. Given iso_date = "2024-12-25T14:30:00", use slicing to extract: year, month, day, hour, minute, second. Print a formatted result.

In [ ]:

Q3. Write code to check if word = "racecar" is a palindrome using slicing. Print the result with an explanation.

In [ ]:

Q4. Given phone = "+91-98765-43210", remove the hyphens with .replace(), then use slicing to extract just the 10-digit number (last 10 characters). Print it.

In [ ]:

Q5. Write code that reverses each word in sentence = "Hello World Python" but keeps word order. Use split() and slicing.

In [ ]:

3. Core String Methods : Search, Replace & Transform

🔍 What is it?

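Python strings have 40+ built-in methods. The most important ones for data work are: upper(), lower(), strip(), replace(), find(), count(), startswith(), and endswith(). None of them modifies the string in place: the transforming methods return a new string, find() and count() return an integer, and startswith() and endswith() return a boolean.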

Method	Purpose	Example
`.upper()` / `.lower()`	Case conversion	Standardize text
`.strip()`	Remove leading and trailing whitespace	Clean user input
`.replace(old, new)`	Replace substrings	Data correction
`.find(sub)`	Find position (-1 if absent)	Safe search
`.count(sub)`	Count occurrences	Frequency analysis
`.startswith()`	Check prefix	Filter by pattern
`.endswith()`	Check suffix	File type detection

💼 Why Data Analysts Care

• Data standardization: name.strip().title() — clean and capitalize names

• Text cleaning: text.replace('\n', ' ').strip() — normalize whitespace

• File filtering: if filename.endswith('.csv'): — filter file types

• Search: .find() returns -1 instead of raising error (safer than .index())

⚠️ find() vs index()

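.find() returns -1 if not found. .index() raises ValueError. Prefer .find() when a missing substring is a normal case, and check for -1 before using the result, because -1 is also a valid index (the last character).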
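A short sketch of the two search methods side by side. The sample string `line` is made up, and the point to notice is what happens when a -1 result is used without checking.

```python
line = 'user=alice;role=admin'

print(line.find('role'))    # 11
print(line.find('email'))   # -1

# Pitfall: -1 is a valid index, so an unchecked result slices silently
pos = line.find('email')
print(line[pos:])           # n  (the last character, not an error)

# Checking for -1 before slicing
pos = line.find('role=')
role = line[pos + len('role='):] if pos != -1 else None
print(role)                 # admin

# index() raises instead, which suits cases where absence is a bug
missing = False
try:
    line.index('email')
except ValueError:
    missing = True
    print('substring not found')
```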

🧠 Pro Tip

Chain methods for clean data pipelines: name.strip().lower().replace(' ', '_') converts ' John Doe ' to 'john_doe'.
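Method chaining applied to a small batch of values might look like the sketch below. The sample names and filenames are invented for illustration.

```python
raw_names = ['  John Doe ', 'ALICE smith', '\tbob  ']

# strip -> lower -> replace, applied left to right
keys = [name.strip().lower().replace(' ', '_') for name in raw_names]
print(keys)     # ['john_doe', 'alice_smith', 'bob']

# endswith() also accepts a tuple of suffixes; lower() makes the test case-insensitive
files = ['q1_sales.CSV', 'logo.png', 'q2_sales.tsv', 'readme.md']
tabular = [f for f in files if f.lower().endswith(('.csv', '.tsv'))]
print(tabular)  # ['q1_sales.CSV', 'q2_sales.tsv']

# count() is case-sensitive, so normalize the case first
headline = 'Sales up, sales targets met'
print(headline.count('sales'))          # 1
print(headline.lower().count('sales'))  # 2
```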

In [ ]:

🧪 Concept Checks: `String Methods`

Q1. Given name = " alice SMITH ", clean it to produce "Alice Smith" using method chaining (strip, title). Print the result.

In [ ]:

Q2. Given csv_line = "John,25,Engineer,NYC", use .find() to locate the position of the second comma. Print the result.

In [ ]:

Q3. Count how many times the word "data" appears in text = "Data science uses data to derive data-driven insights" (case-insensitive). Print the count.

In [ ]:

Q4. Given a list of filenames ["report.csv", "image.png", "data.csv", "notes.txt"], use .endswith() to filter only .csv files. Print the result.

In [ ]:

Q5. Write code that replaces all spaces in "hello world python" with underscores, then converts to uppercase. Chain the methods in one line.

In [ ]:

4. Splitting & Joining : Text Decomposition & Assembly

🔍 What is it?

split() breaks a string into a list of substrings based on a delimiter. join() does the reverse: it combines a list of strings into one string with a separator. These two methods are the backbone of text data processing.

# split: string → list
'a,b,c'.split(',')        # ['a', 'b', 'c']

# join: list → string
','.join(['a', 'b', 'c'])  # 'a,b,c'

💼 Why Data Analysts Care

• CSV parsing: row.split(',') — manual CSV field extraction

• Log analysis: log_line.split() — split on whitespace for field extraction

• Data export: ','.join(columns) — build CSV rows for output

• Path building: '/'.join(['home', 'user', 'data']) — construct file paths

⚠️ split() with No Arguments

'a  b  c'.split() (two spaces between letters) splits on any run of whitespace and never returns empty strings, giving ['a', 'b', 'c']. 'a  b  c'.split(' ') splits on every single space, producing ['a', '', 'b', '', 'c'].

🧠 Pro Tip

Use splitlines() for multi-line text: it handles \n, \r\n, and \r correctly across all platforms.
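The split and join behaviors described in this section, collected into one sketch. All sample strings are hypothetical.

```python
record = 'alice;28;Berlin'
fields = record.split(';')
print(fields)                  # ['alice', '28', 'Berlin']
print(','.join(fields))        # alice,28,Berlin

# No argument: a run of any whitespace acts as one separator
print('a  b \t c'.split())     # ['a', 'b', 'c']
# Explicit ' ': every single space is a separator, so empty strings appear
print('a  b'.split(' '))       # ['a', '', 'b']

# maxsplit limits the number of cuts (handy for "key: value" lines)
key, value = 'message: disk full: /dev/sda1'.split(': ', 1)
print(key, '->', value)        # message -> disk full: /dev/sda1

# join() needs strings, so convert numbers first
date_key = '-'.join(str(n) for n in [2024, 1, 15])
print(date_key)                # 2024-1-15

# splitlines() copes with \n, \r\n and \r
lines = 'one\ntwo\r\nthree\rfour'.splitlines()
print(lines)                   # ['one', 'two', 'three', 'four']
```

For real CSV files with quoted fields, the standard-library csv module is usually a safer choice than a bare split(',').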

In [ ]:

🧪 Concept Checks: `Split & Join`

Q1. Given csv = "name,age,city,salary", split into a list, then rejoin with " | " as separator. Print both results.

In [ ]:

Q2. Given path = "/home/user/data/file.csv", split by "/" and extract just the filename. Print it.

In [ ]:

Q3. Split text = "one  two   three four" (note the repeated spaces) using both .split() and .split(" "). Print both results and explain the difference.

In [ ]:

Q4. Given words = ["SELECT", "name", "FROM", "users"], join them with spaces to build a SQL query string. Print it.

In [ ]:

Q5. Write code that reads a multi-line string (use triple quotes with 3 lines) and splits it into individual lines using .splitlines(). Print each line with its index.

In [ ]:

5. String Formatting : Professional Output Generation

🔍 What is it?

Python offers three formatting approaches: f-strings (Python 3.6+, fastest and most readable), .format(), and %-formatting (legacy). F-strings embed expressions directly inside {} braces.

Format Spec	Meaning	Example	Result
`:.2f`	2 decimal places	`f'{3.14159:.2f}'`	`3.14`
`:,`	Thousands separator	`f'{1000000:,}'`	`1,000,000`
`:>10`	Right-align (width 10)	`f'{"hi":>10}'`	`'        hi'`
`:<10`	Left-align (width 10)	`f'{"hi":<10}'`	`'hi        '`
`:^10`	Center-align (width 10)	`f'{"hi":^10}'`	`'    hi    '`
`:.2%`	Percentage	`f'{0.856:.2%}'`	`85.60%`

💼 Why Data Analysts Care

• Report generation: Formatted tables, aligned columns, currency values

• Logging: f'[{timestamp}] {level}: {message}' — structured log output

• Dashboard metrics: f'{revenue:,.2f}' — professional number formatting

🧠 Pro Tip

F-strings can contain any Python expression: f'{len(data):,} records processed in {elapsed:.1f}s'. They are evaluated at runtime.
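A sketch of the format specs from the table, plus the same number rendered with the two older approaches. The values and product names are invented for illustration.

```python
revenue = 98765.4321
ratio = 0.856
label = 'hi'

print(f'{revenue:,.2f}')      # 98,765.43
print(f'{ratio:.1%}')         # 85.6%

# repr() makes the padding visible
print(repr(f'{label:>10}'))   # '        hi'
print(repr(f'{label:<10}'))   # 'hi        '
print(repr(f'{label:^10}'))   # '    hi    '

# The same number with .format() and %-formatting
print('{:,.2f}'.format(revenue))   # 98,765.43
print('%.2f' % revenue)            # 98765.43

# Aligned columns: name left-aligned (10 wide), price right-aligned (8 wide)
rows = [('Widget', 4.5), ('Gadget', 129.99)]
table = [f'{name:<10}{price:>8.2f}' for name, price in rows]
for line in table:
    print(line)
```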

In [ ]:

🧪 Concept Checks: `Formatting`

Q1. Given revenue = 1234567.89, print it as currency with commas and 2 decimal places: $1,234,567.89. Use an f-string.

In [ ]:

Q2. Create a formatted table: print 3 products with name (left-aligned, 15 chars) and price (right-aligned, 8 chars, 2 decimals). Use f-string alignment.

In [ ]:

Q3. Given ratio = 0.8567, print it as a percentage with 1 decimal place: 85.7%. Use the % format specifier.

In [ ]:

Q4. Print the number 42 in binary, octal, and hexadecimal using f-string format specs (:b, :o, :x). Print all three.

In [ ]:

Q5. Write an f-string that embeds a conditional expression: print "Even" or "Odd" for n = 7 directly inside the f-string.

In [ ]:

6. String Validation Methods : Data Quality Checks

🔍 What is it?

Python strings have built-in validation methods that return True or False. These are essential for input validation and data quality checks before processing.

Method	Returns True if...	Example
`.isdigit()`	All characters are digits	`'123'.isdigit()` → `True`
`.isalpha()`	All characters are letters	`'abc'.isalpha()` → `True`
`.isalnum()`	Letters or digits only	`'abc123'.isalnum()` → `True`
`.isspace()`	All whitespace	`' '.isspace()` → `True`
`.isupper()`	All uppercase	`'ABC'.isupper()` → `True`
`.islower()`	All lowercase	`'abc'.islower()` → `True`
`.istitle()`	Title case	`'Hello World'.istitle()` → `True`

💼 Why Data Analysts Care

• Input validation: if user_id.isdigit(): — validate before int conversion

• Data cleaning: Filter rows where a column should be numeric but contains text

• ETL pipelines: Validate data quality before loading into databases

⚠️ isdigit() vs isnumeric()

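isdigit() is not limited to 0-9: it also returns True for Unicode digits like '\u00B2' (superscript 2), which int() rejects. isdecimal() is stricter and rejects superscripts, while isnumeric() is broader and also accepts characters like '\u00BD' (the fraction one half). None of the three accepts a sign or a decimal point. For data work, use try/except with int(), or combine isascii() with isdigit().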
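How the three numeric checks differ can be seen by running them on a few edge cases. This is a minimal sketch: the helper names `to_int` and `is_ascii_digits` are hypothetical, and `str.isascii()` assumes Python 3.7 or later.

```python
samples = ['123', '\u00b2', '\u00bd', '-5', '3.14', '']

# Columns: value, isdecimal, isdigit, isnumeric
for s in samples:
    print(repr(s), s.isdecimal(), s.isdigit(), s.isnumeric())

def to_int(s):
    """Return int(s), or None if s is not a valid integer literal."""
    try:
        return int(s)
    except ValueError:
        return None

def is_ascii_digits(s):
    """True only for non-empty strings made of the characters 0-9."""
    return s.isascii() and s.isdigit()

print(to_int('42'))              # 42
print(to_int('-5'))              # -5  (isdigit() would have rejected the sign)
print(to_int('\u00b2'))          # None (isdigit() is True, but int() fails)
print(is_ascii_digits('\u00b2')) # False
```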

In [ ]:

🧪 Concept Checks: `Validation`

Q1. Given a list inputs = ["123", "12.5", "abc", "45", ""], use .isdigit() to filter only valid integers. Print the valid ones.

In [ ]:

Q2. Write a function validate_username(name) that returns True only if: length 3-20, alphanumeric only. Test with 5 examples.

In [ ]:

Q3. Given data = ["Hello", "WORLD", "mixedCase", "Title Case"], classify each as upper, lower, title, or mixed. Use .isupper(), .islower(), .istitle().

In [ ]:

Q4. Write code that checks if s = " \t\n " is all whitespace using .isspace(). Then check "" (empty string). What does empty return? Explain.

In [ ]:

Q5. Write a safe to_float(s) function that handles strings like "3.14", "-2.5", "abc", "". Return None for invalid inputs. Test with 5 cases.

In [ ]:

7. Encoding & Unicode : Global Text Processing

🔍 What is it?

Python 3 strings (str) are sequences of Unicode code points; UTF-8 is the default encoding for source files and for encode() and decode(). When working with files, APIs, or databases, you must handle encoding correctly. encode() converts str → bytes, decode() converts bytes → str.

text = 'Hello'
bytes_obj = text.encode('utf-8')   # b'Hello'
back = bytes_obj.decode('utf-8')   # 'Hello'

💼 Why Data Analysts Care

• API responses: JSON/REST APIs often return bytes that need decoding

• File I/O: open(file, encoding='utf-8') — always specify encoding

• International data: Names, addresses, currencies in non-Latin scripts need proper Unicode handling

⚠️ UnicodeDecodeError

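Reading a file with the wrong encoding can raise UnicodeDecodeError or, worse, silently produce garbled text. Specify the encoding explicitly (usually encoding='utf-8'), or detect it with a library like chardet when the source is unknown.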
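The sketch below shows both failure modes of a mismatched codec on a made-up sample word: silent garbling in one direction and an exception in the other.

```python
text = 'Café'
data = text.encode('utf-8')
print(data)                      # b'Caf\xc3\xa9'
print(len(text), len(data))      # 4 5  (é takes two bytes in UTF-8)

# Decoding UTF-8 bytes as Latin-1 does not fail; it garbles the text
garbled = data.decode('latin-1')
print(garbled)                   # CafÃ©

# Latin-1 bytes that are not valid UTF-8 raise UnicodeDecodeError
legacy = text.encode('latin-1')  # b'Caf\xe9'
failed = False
try:
    legacy.decode('utf-8')
except UnicodeDecodeError as exc:
    failed = True
    print('decode failed:', exc.reason)

# errors='replace' substitutes U+FFFD instead of raising
recovered = legacy.decode('utf-8', errors='replace')
print(recovered == 'Caf\ufffd')  # True
```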

In [ ]:

🧪 Concept Checks: `Encoding`

Q1. Encode text = "Python" to UTF-8 bytes. Print the bytes object and its length. Then decode it back and verify equality.

In [ ]:

Q2. Compare the byte length of "A" vs "\u00C9" (accented E) vs a Chinese character "\u4e16" in UTF-8. Print each character and its byte count.

In [ ]:

Q3. Use ord() to print the Unicode code point of each character in "Hello". Then use chr() to reconstruct the string from code points.

In [ ]:

Q4. Write code that safely decodes a bytes object, trying UTF-8 first, then Latin-1 as fallback. Use try/except with .decode().

In [ ]:

Q5. Create a string that mixes ASCII text, numbers, and at least one non-ASCII character (for example an accented letter or a currency symbol like €). Print its len() (characters) and len(s.encode()) (bytes). Explain the difference.

In [ ]:

8. String Performance : Efficient Text Processing

🔍 What is it?

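Since strings are immutable, concatenation in a loop can create a new temporary string on every iteration, so repeated += can take quadratic time in the worst case. For building large strings, collect the pieces in a list and call join() once, or write to io.StringIO. CPython can sometimes optimize += in place, so the actual speedup depends on the interpreter and the data size; measure it with timeit.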

Approach	Speed	Memory	Use When
`+=` in loop	Slow	High	Never for large data
`''.join(list)`	Fast	Low	Building strings in loops
f-strings	Fastest	Low	Single-line formatting
`io.StringIO`	Fast	Medium	Stream-like building
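One way to compare the approaches in the table is a small timeit harness like the sketch below. The function names and the value of `n` are arbitrary choices, and the timings depend on the machine and interpreter, so no expected numbers are shown.

```python
import io
import timeit

def build_with_plus(n):
    """Concatenate with += inside the loop."""
    out = ''
    for i in range(n):
        out += str(i) + ','
    return out

def build_with_join(n):
    """Generate the pieces, then join them once."""
    return ''.join(str(i) + ',' for i in range(n))

def build_with_stringio(n):
    """Write the pieces into an in-memory text buffer."""
    buf = io.StringIO()
    for i in range(n):
        buf.write(str(i))
        buf.write(',')
    return buf.getvalue()

n = 10_000
# All three approaches must produce the same string
same = build_with_plus(n) == build_with_join(n) == build_with_stringio(n)
print('identical output:', same)

for fn in (build_with_plus, build_with_join, build_with_stringio):
    seconds = timeit.timeit(lambda: fn(n), number=20)
    print(f'{fn.__name__:<20} {seconds:.4f}s')
```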

💼 Why Data Analysts Care

• ETL pipelines: Building CSV output with join() instead of += saves minutes on large datasets

• Report generation: Use join() for assembling multi-line reports

• Memory management: Knowing string interning helps debug identity issues

🧠 Pro Tip

Python interns small strings and identifiers. 'hello' is 'hello' may be True due to caching, but never rely on this — always use == for comparison.

In [ ]:

🧪 Concept Checks: `Performance`

Q1. Build a string of numbers "0,1,2,...,999" using: (a) += in a loop, (b) ",".join(). Time both approaches and print the speedup ratio.

In [ ]:

Q2. Demonstrate string interning: test a = "hello"; b = "hello"; print(a is b). Then test with a = "hello world". Explain the difference.

In [ ]:

Q3. Write code that builds a CSV string from data = [("Alice",25), ("Bob",30), ("Charlie",35)] using join(). Print the result.

In [ ]:

Q4. Use sys.getsizeof() to measure memory of: empty string, "a", "hello", "a"*1000. Print each size. What pattern do you notice?

In [ ]:

Q5. Write a function build_report(rows) that takes a list of dicts and returns a formatted table string using join(). Test with 3 sample rows.

In [ ]:

🛠️ Professional Practice Tasks

Theory is useless without muscle memory. Complete these tasks to solidify your understanding.

Task 1 (Data Cleaner): Write a function clean_name(name) that: strips whitespace, converts to title case, replaces multiple spaces with a single space, and removes non-alphabetic characters (except spaces). Test with '  john   DOE  3rd  ' (note the repeated spaces).

In [ ]:

Task 2 (CSV Parser): Write a function parse_csv_line(line) that splits a CSV line by commas, strips each field, and returns a list. Handle edge case: fields containing commas inside quotes. Test with 'Alice, 28, "New York, NY"'.

In [ ]:

Task 3 (Log Analyzer): Given log = "2024-01-15 14:30:22 ERROR Database connection failed", extract: date, time, level, message using string methods only (no regex). Print each part.

In [ ]:

Task 4 (Email Validator): Write a function validate_email(email) that checks: contains exactly one @, has text before and after @, domain has a dot, no spaces. Return True/False. Test with 5 valid and 5 invalid emails.

In [ ]:

Task 5 (Text Statistics): Write a function text_stats(text) that returns a dict with: character count, word count, sentence count, average word length, most common word. Test with a paragraph of text.

In [ ]:

💻 Pure Coding Interview Questions

Q1.

Write a function `reverse_words(s)` that reverses word order: `'hello world'` → `'world hello'`. Do NOT reverse individual characters.

In [ ]:

Q2.

Write a function `is_anagram(s1, s2)` that checks if two strings are anagrams (case-insensitive, ignoring spaces). Test with `'listen'` and `'silent'`.

In [ ]:

Q3.

Write a function `compress(s)` implementing run-length encoding: `'aaabcccdd'` → `'a3b1c3d2'`. Return the original string if the encoded result is not shorter.

In [ ]:

Q4.

Write a function `first_non_repeating(s)` that finds the first non-repeating character. `'aabbc'` → `'c'`. Return `None` if all repeat.

In [ ]:

Q5.

Write a function `caesar_cipher(text, shift)` that shifts each letter by `shift` positions. Handle wrapping (z→a) and preserve non-letters.

In [ ]:

Q6.

Write a function `longest_common_prefix(strs)` that finds the longest common prefix in a list of strings. `['flower','flow','flight']` → `'fl'`.

In [ ]:

Q7.

Write a function `valid_parentheses(s)` that checks if brackets are balanced: `'([{}])'` → `True`, `'([)]'` → `False`.

In [ ]:

Q8.

Write a function `count_vowels(s)` that returns a dict of vowel frequencies (case-insensitive). Test with a sentence.

In [ ]:

Q9.

Write a function `title_case(s)` that capitalizes the first letter of each word, except articles (`a, an, the`). First word always capitalized.

In [ ]:

Q10.

Write a function `remove_duplicates(s)` that removes duplicate characters preserving order: `'abcabc'` → `'abc'`.

In [ ]:

Q11.

Write a function `zigzag(s, rows)` that converts text to zigzag pattern and reads row by row. `'PAYPALISHIRING'` with 3 rows → `'PAHNAPLSIIGYIR'`.

In [ ]:

Q12.

Write a function `word_pattern(pattern, s)` that checks if string follows pattern: `pattern='abba', s='dog cat cat dog'` → `True`.

In [ ]:

Q13.

Write a function `group_anagrams(words)` that groups anagrams together. `['eat','tea','tan','ate','nat','bat']` → grouped lists.

In [ ]:

Q14.

Write a function `find_overlapping(s, sub)` that counts all overlapping occurrences of `sub` in `s`. E.g., `find_overlapping('aaa', 'aa')` returns `2`.

In [ ]:

Q15.

Write a function `pad_number(n, width)` that pads a number with leading zeros to the given width. E.g., `pad_number(42, 5)` returns `'00042'`.

In [ ]:

Q16.

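Write code to implement `str.replace()` from scratch: `my_replace(text, old, new)`. For overlapping matches, behave like `str.replace()`: replace non-overlapping occurrences from left to right (e.g., `'aaa'` with `old='aa'` has one match).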

In [ ]:

Q17.

Write a function `repeat_chars(s, n)` that repeats each character n times: `repeat_chars('abc', 3)` returns `'aaabbbccc'`.

In [ ]:

Q18.

Write a function `longest_palindrome_substring(s)` that finds the longest palindromic substring in a string.

In [ ]:

Q19.

Write a function `atoi(s)` that converts string to integer handling: whitespace, signs, overflow, invalid chars. Mimic `int()` behavior.

In [ ]:

Q20.

Write a function `justify_text(text, width)` that fully justifies text to given width by distributing spaces evenly between words.

In [ ]:

Q21.

Write a function `compare_version(v1, v2)` that compares version strings: `'1.2.3'` vs `'1.2.4'` → `-1`. Handle different lengths.

In [ ]:

Q22.

Write a function `interleave(s1, s2)` that interleaves two strings: `'abc','xyz'` → `'axbycz'`. Handle different lengths.

In [ ]:

Q23.

Write a function `count_substrings(s, sub)` that counts overlapping occurrences: `'aaa'` contains `'aa'` twice.

In [ ]:

Q24.

Write a function `to_snake_case(s)` converting `'camelCaseString'` → `'camel_case_string'`. Handle consecutive capitals.

In [ ]:

Q25.

Write a function `expand_range(s)` that expands: `'1-5,8,11-14'` → `[1,2,3,4,5,8,11,12,13,14]`.

In [ ]:

📊 Day 3 Executive Summary

#	Topic	Key Takeaway	Professional Application
1	Creation	4 ways to create; strings are immutable	File paths, SQL queries
2	Indexing & Slicing	Zero-based; `stop` is exclusive; `[::-1]` reverses	Log parsing, data extraction
3	Core Methods	`.strip()`, `.replace()`, `.find()` — chain them	Data cleaning pipelines
4	Split & Join	`split()` → list; `join()` → string	CSV/text parsing
5	Formatting	f-strings are fastest and most readable	Reports, dashboards
6	Validation	`.isdigit()`, `.isalpha()` for quality checks	Input validation, ETL
7	Encoding	UTF-8 default; `encode()`/`decode()` for bytes	APIs, file I/O
8	Performance	`join()` >> `+=` for loop concatenation	Large-scale text processing

✅ Instructor's End-of-Day Checklist

• [ ] I understand string immutability and its performance implications.

• [ ] I can use slicing to extract substrings efficiently.

• [ ] I know the difference between .find() (safe) and .index() (raises error).

• [ ] I can use f-strings with format specs for professional output.

• [ ] I understand encoding and can handle UTF-8/bytes conversion.

• [ ] I have completed all 5 practice tasks.

• [ ] I have reviewed all 25 interview questions.

🚀 Continue to Day 4 - Lists: Dynamic Sequences, Manipulation, and Data Pipelines
