When it comes to AI-assisted coding, OpenAI ChatGPT 5 and Grok 4 are two of the most talked-about tools of 2025. Both claim to write Python code with high accuracy and efficiency, but which one actually delivers cleaner, more maintainable scripts in real-world scenarios?

In this article, we compare ChatGPT 5 and Grok 4 across several Python challenges. We look at code correctness, readability, execution speed, and adaptability to complex instructions.

Why compare ChatGPT 5 and Grok 4 for Python?

Python remains the go-to language for AI, automation, and data science. Many developers now rely on AI to speed up their coding workflow, debug faster, and even generate complete applications from scratch.

While ChatGPT has long dominated this space, Grok has gained popularity for its code clarity and concise output. The two follow different training philosophies: ChatGPT emphasizes reasoning, while Grok favors brevity.

Testing OpenAI ChatGPT 5 and Grok 4

We will compare the two AIs across 10 Python coding test cases:

  1. Basic Script (Sum of Even Numbers)
  2. Data Processing (CSV Totals)
  3. Algorithmic Problem (First Non-Repeating Character)
  4. API Integration (Fetch Bitcoin Price)
  5. Web Scraping (Hacker News Headlines)
  6. File Compression (Zip .txt Files)
  7. Image Processing (Resize JPEG Files)
  8. Multithreading (Concurrent Downloads)
  9. Database Interaction (SQLite CRUD)
  10. Unit Testing (Prime Number Test)

All prompts are identical and were submitted to both ChatGPT 5 and Grok 4. You can run the tests yourself to validate the results.

Python Challenge 1 - Basic Script

Challenge 1 Prompt:

Write a Python script that reads a list of integers and outputs the sum of the even numbers.

Challenge 1 GPT 5 Code:

# Script to read a list of integers and output the sum of even numbers
def sum_of_even_numbers(numbers):
    return sum(num for num in numbers if num % 2 == 0)

if __name__ == "__main__":
    # Read input from the user
    try:
        numbers = list(map(int, input("Enter integers separated by spaces: ").split()))
        total_even = sum_of_even_numbers(numbers)
        print("Sum of even numbers:", total_even)
    except ValueError:
        print("Please enter valid integers only.")

Challenge 1 Grok 4 Code:

# Read a line of input containing space-separated integers
input_line = input("Enter a list of integers separated by spaces: ")

# Convert the input string into a list of integers
numbers = list(map(int, input_line.split()))

# Initialize sum of even numbers
even_sum = 0

# Iterate through the list and add even numbers to the sum
for num in numbers:
    if num % 2 == 0:
        even_sum += num

# Output the sum
print("Sum of even numbers:", even_sum)

Challenge 1 Observations:

  • GPT 5 Speed: Very fast response, near instant.
  • GPT 5 Code: Provided shorter and elegant code with entrypoint, with comments.
  • Grok 4 Speed: Slower response, “thought” a bit longer (~2 seconds).
  • Grok 4 Code: Provided a bit longer code and without entrypoint, with comments.

Python Challenge 2 - Data Processing

Challenge 2 Prompt:

Load a CSV file of transactions and print the total sales per product.

Challenge 2 GPT 5 Code:

import csv
from collections import defaultdict

def total_sales_per_product(csv_file):
    sales = defaultdict(float)

    with open(csv_file, newline='', encoding='utf-8') as f:
        reader = csv.DictReader(f)
        for row in reader:
            try:
                product = row['product']
                quantity = float(row['quantity'])
                price = float(row['price'])
                sales[product] += quantity * price
            except (KeyError, ValueError):
                print(f"Skipping invalid row: {row}")

    return sales

if __name__ == "__main__":
    file_path = "transactions.csv"  # Change to your file path
    sales_totals = total_sales_per_product(file_path)

    print("Total Sales Per Product:")
    for product, total in sales_totals.items():
        print(f"{product}: ${total:,.2f}")

Challenge 2 Grok 4 Code:

import pandas as pd

# Prompt user for the CSV file path
file_path = input("Enter the path to the CSV file: ")

# Load the CSV file into a DataFrame
df = pd.read_csv(file_path)

# Assuming the CSV has columns 'Product' and 'Sales'
# Group by 'Product' and sum the 'Sales'
total_sales = df.groupby('Product')['Sales'].sum()

# Print the total sales per product
print("Total sales per product:")
print(total_sales)

Challenge 2 Observations:

  • GPT 5 Speed: Very fast response, near instant.
  • GPT 5 Code: Provided a bit longer and elegant code with entrypoint, but with almost no comments.
  • Grok 4 Speed: Slower response, “thought” a bit longer (~13 seconds).
  • Grok 4 Code: Provided simpler code and without entrypoint, with comments.
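
The two answers also assume different CSV layouts, which matters more than the style differences. The following is a minimal standard-library sketch of that point, not either model's output: SAMPLE_CSV and both helper names are hypothetical, and the column names are simply the ones each script assumed, since the prompt does not specify a schema.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data using the layout GPT 5 assumed
# (one row per transaction, with quantity and unit price).
SAMPLE_CSV = """product,quantity,price
apple,2,1.50
pear,1,2.00
apple,3,1.50
"""


def totals_from_quantity_and_price(text):
    """Sum quantity * price per product (the layout GPT 5 assumed)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(text)):
        totals[row["product"]] += float(row["quantity"]) * float(row["price"])
    return dict(totals)


def totals_from_sales_column(text, product_col="Product", sales_col="Sales"):
    """Sum a ready-made sales column per product (the layout Grok 4 assumed)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(text)):
        totals[row[product_col]] += float(row[sales_col])
    return dict(totals)


if __name__ == "__main__":
    print(totals_from_quantity_and_price(SAMPLE_CSV))
    try:
        totals_from_sales_column(SAMPLE_CSV)
    except KeyError as missing:
        # The same file fails under the other layout assumption.
        print(f"Missing column: {missing}")
```

Whichever layout your file uses, only one of the two generated scripts will run on it unchanged, so it is worth stating the column names in the prompt.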

Python Challenge 3 - Algorithmic Problem

Challenge 3 Prompt:

Given a string, return the first non-repeating character.

Challenge 3 GPT 5 Code:

def first_non_repeating_char(s):
    from collections import Counter
    counts = Counter(s)

    for char in s:
        if counts[char] == 1:
            return char
    return None  # If no non-repeating character found

# Example usage
text = "swiss"
result = first_non_repeating_char(text)
if Code:  # Bug as generated: "Code" is undefined (NameError); "result" was intended
    print(f"The first non-repeating character is: '{result}'")
else:
    print("No non-repeating character found.")

Challenge 3 Grok 4 Code:

import collections

def first_non_repeating_char(s):
    """
    Returns the first non-repeating character in the string.
    If no such character exists, returns None.
    """
    count = collections.Counter(s)
    for char in s:
        if count[char] == 1:
            return char
    return None

# Example usage
input_string = input("Enter a string: ")
result = first_non_repeating_char(input_string)
print("First non-repeating character:", result)

Challenge 3 Observations:

  • GPT 5 Speed: Very fast response, near instant.
  • GPT 5 Code: Provided short code without entrypoint and with few comments. The example usage has a bug: it tests an undefined name (Code) instead of result, so it raises a NameError.
  • Grok 4 Speed: Slower response, “thought” a bit longer (~8 seconds).
  • Grok 4 Code: Provided short code without entrypoint, with a brief docstring.
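
Both models chose the same Counter-based approach, so the bug in GPT 5's example usage (the undefined name Code) is the only functional difference. As a rough illustration, here is one way the example usage could be written so that it runs; the sample strings are arbitrary, and the explicit "is not None" check is a small choice of this sketch rather than something either model produced.

```python
from collections import Counter


def first_non_repeating_char(s):
    """Return the first character that occurs exactly once in s, or None."""
    counts = Counter(s)
    for char in s:
        if counts[char] == 1:
            return char
    return None


if __name__ == "__main__":
    # Arbitrary sample inputs: one with a unique character, two without.
    for text in ("swiss", "aabb", ""):
        result = first_non_repeating_char(text)
        if result is not None:
            print(f"{text!r}: first non-repeating character is {result!r}")
        else:
            print(f"{text!r}: no non-repeating character found")
```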

Python Challenge 4 - API Integration

Challenge 4 Prompt:

Use the requests library to fetch the current Bitcoin price in USD from a public API and print it.

Challenge 4 GPT 5 Code:

import requests

def get_bitcoin_price_usd():
    url = "https://api.coindesk.com/v1/bpi/currentprice/USD.json"
    try:
        response = requests.get(url, timeout=5)
        response.raise_for_status()  # Raise HTTPError for bad responses
        data = response.json()
        price = data["bpi"]["USD"]["rate"]
        print(f"Current Bitcoin Price: ${price} USD")
    except requests.exceptions.RequestException as e:
        print(f"Error fetching Bitcoin price: {e}")
    except (KeyError, ValueError) as e:
        print(f"Error parsing response: {e}")

if __name__ == "__main__":
    get_bitcoin_price_usd()

Challenge 4 Grok 4 Code:

import requests

# Define the API endpoint
url = "https://api.coingecko.com/api/v3/simple/price?ids=bitcoin&vs_currencies=usd"

# Fetch the data
response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    data = response.json()
    bitcoin_price = data['bitcoin']['usd']
    print(f"Current Bitcoin price in USD: ${bitcoin_price}")
else:
    print(f"Error fetching data: {response.status_code}")

Challenge 4 Observations:

  • GPT 5 Speed: Very fast response, near instant.
  • GPT 5 Code: Provided short and elegant code with entrypoint, but with almost no comments.
  • Grok 4 Speed: Slower response, “thought” a bit longer (~23 seconds).
  • Grok 4 Code: Provided a bit shorter code and without entrypoint, with comments.

Python Challenge 5 - Web Scraping

Challenge 5 Prompt:

Use requests and BeautifulSoup to scrape the top 5 news headlines from https://news.ycombinator.com/ and print them.

Challenge 5 GPT 5 Code:

import requests
from bs4 import BeautifulSoup

def get_top_hn_headlines(limit=5):
    url = "https://news.ycombinator.com/"
    try:
        response = requests.get(url, timeout=5)
        response.raise_for_status()

        soup = BeautifulSoup(response.text, "html.parser")
        headlines = [a.get_text() for a in soup.select(".titleline > a")]

        print(f"Top {limit} Hacker News Headlines:")
        for i, title in enumerate(headlines[:limit], 1):
            print(f"{i}. {title}")
    except requests.exceptions.RequestException as e:
        print(f"Error fetching Hacker News: {e}")

if __name__ == "__main__":
    get_top_hn_headlines()

Challenge 5 Grok 4 Code:

import requests
from bs4 import BeautifulSoup

# Define the URL
url = 'https://news.ycombinator.com/'

# Fetch the webpage
response = requests.get(url)
response.raise_for_status()  # Raise an error for bad status codes

# Parse the HTML content
soup = BeautifulSoup(response.text, 'html.parser')

# Find the top 5 story elements
stories = soup.find_all('tr', class_='athing')[:5]

# Print the top 5 headlines
print("Top 5 News Headlines from Hacker News:")
for i, story in enumerate(stories, 1):
    title_span = story.find('span', class_='titleline')
    if title_span:
        headline = title_span.find('a').text
        print(f"{i}. {headline}")

Challenge 5 Observations:

  • GPT 5 Speed: Very fast response, near instant.
  • GPT 5 Code: Provided short and elegant code with entrypoint, but without comments.
  • Grok 4 Speed: Slower response, “thought” a bit longer (~13 seconds).
  • Grok 4 Code: Provided a bit longer code and without entrypoint, with comments.

Python Challenge 6 - File Compression

Challenge 6 Prompt:

Create a Python script that zips all .txt files in the current directory into “archive.zip”.

Challenge 6 GPT 5 Code:

import zipfile
import glob
import os

def zip_text_files(output_zip="archive.zip"):
    txt_files = glob.glob("*.txt")
    if not txt_files:
        print("No .txt files found in the current directory.")
        return

    with zipfile.ZipFile(output_zip, "w") as zipf:
        for file in txt_files:
            zipf.write(file, os.path.basename(file))
            print(f"Added: {file}")

    print(f"Created {output_zip} containing {len(txt_files)} files.")

if __name__ == "__main__":
    zip_text_files()

Challenge 6 Grok 4 Code:

import os
from zipfile import ZipFile

# Get the current directory
current_dir = os.getcwd()

# List all .txt files in the current directory
txt_files = [f for f in os.listdir(current_dir) if f.endswith('.txt')]

# Create a zip file and add the .txt files
with ZipFile('archive.zip', 'w') as zipf:
    for file in txt_files:
        zipf.write(file)

print(f"Successfully zipped {len(txt_files)} .txt files into 'archive.zip'.")

Challenge 6 Observations:

  • GPT 5 Speed: Very fast response, near instant.
  • GPT 5 Code: Provided short and elegant code with entrypoint, but without comments.
  • Grok 4 Speed: Slower response, “thought” a bit longer (~10 seconds).
  • Grok 4 Code: Provided short code, without entrypoint, with comments.
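
One detail applies to both answers: neither passes a compression argument, and zipfile.ZipFile defaults to ZIP_STORED, which archives files without compressing them. If smaller archives matter, a variant along the following lines could be used. This is a sketch under assumptions, not either model's output: the directory parameter is added here only to make the function easy to try on a scratch folder.

```python
import glob
import os
import zipfile


def zip_text_files(directory=".", output_zip="archive.zip"):
    """Zip every .txt file in `directory` using DEFLATE compression.

    Returns the archive path and the number of files added.
    """
    txt_files = sorted(glob.glob(os.path.join(directory, "*.txt")))
    archive_path = os.path.join(directory, output_zip)
    # ZIP_DEFLATED compresses entries; the default (ZIP_STORED) does not.
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zipf:
        for path in txt_files:
            # Store each file under its bare name, without directory components.
            zipf.write(path, arcname=os.path.basename(path))
    return archive_path, len(txt_files)
```

Calling zip_text_files() with no arguments targets the current directory, as the prompt asks.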

Python Challenge 7 - Image Processing

Challenge 7 Prompt:

Use Pillow (PIL) to resize all JPEG images in the current directory to 800x600 pixels and save them in an output folder.

Challenge 7 GPT 5 Code:

import os
from PIL import Image
import glob

def resize_jpeg_images(width=800, height=600, output_dir="output"):
    # Create output folder if it doesn't exist
    os.makedirs(output_dir, exist_ok=True)

    # Find all .jpg and .jpeg files (case-insensitive)
    jpeg_files = glob.glob("*.jpg") + glob.glob("*.jpeg") + glob.glob("*.JPG") + glob.glob("*.JPEG")

    if not jpeg_files:
        print("No JPEG images found in the current directory.")
        return

    for file in jpeg_files:
        try:
            with Image.open(file) as img:
                img_resized = img.resize((width, height), Image.LANCZOS)
                output_path = os.path.join(output_dir, os.path.basename(file))
                img_resized.save(output_path, "JPEG")
                print(f"Resized and saved: {output_path}")
        except Exception as e:
            print(f"Error processing {file}: {e}")

if __name__ == "__main__":
    resize_jpeg_images()

Challenge 7 Grok 4 Code:

import os
from PIL import Image

# Get the current directory
current_dir = os.getcwd()

# Create the output folder if it doesn't exist
output_folder = os.path.join(current_dir, 'output')
os.makedirs(output_folder, exist_ok=True)

# List all JPEG files in the current directory (case-insensitive)
jpeg_files = [f for f in os.listdir(current_dir)
              if f.lower().endswith(('.jpg', '.jpeg'))]

# Resize each image and save to output folder
resized_count = 0
for file in jpeg_files:
    try:
        img_path = os.path.join(current_dir, file)
        img = Image.open(img_path)
        resized_img = img.resize((800, 600))
        output_path = os.path.join(output_folder, file)
        resized_img.save(output_path)
        resized_count += 1
    except Exception as e:
        print(f"Error processing {file}: {e}")

print(f"Successfully resized and saved {resized_count} JPEG images to the 'output' folder.")

Challenge 7 Observations:

  • GPT 5 Speed: Very fast response, near instant.
  • GPT 5 Code: Provided short and elegant code with entrypoint, with comments.
  • Grok 4 Speed: Slower response, “thought” a bit longer (~10 seconds).
  • Grok 4 Code: Provided a bit longer code and without entrypoint, with comments.

Python Challenge 8 - Multithreading

Challenge 8 Prompt:

Create a Python script that downloads 5 different files from given URLs concurrently using threading.

Challenge 8 GPT 5 Code:

import os
from pathlib import Path
from urllib.parse import urlparse
from concurrent.futures import ThreadPoolExecutor, as_completed

import requests

CHUNK_SIZE = 1024 * 64  # 64KB


def unique_path(dest_dir: Path, filename: str) -> Path:
    """Return a unique path inside dest_dir by appending (1), (2), ... if needed."""
    dest = dest_dir / filename
    if not dest.exists():
        return dest
    stem, suffix = os.path.splitext(filename)
    i = 1
    while True:
        candidate = dest_dir / f"{stem} ({i}){suffix}"
        if not candidate.exists():
            return candidate
        i += 1


def filename_from_url(url: str, index: int) -> str:
    """Get a safe filename from URL; fall back to file_{index} if missing."""
    path = urlparse(url).path
    name = os.path.basename(path).strip()
    return name or f"file_{index}"


def download_file(url: str, dest_dir: Path, index: int, timeout: int = 15) -> Path:
    """Download a single file to dest_dir. Returns the saved path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    base_name = filename_from_url(url, index)
    dest_path = unique_path(dest_dir, base_name)

    headers = {"User-Agent": "python-threaded-downloader/1.0"}
    with requests.get(url, stream=True, timeout=timeout, headers=headers) as r:
        r.raise_for_status()
        with open(dest_path, "wb") as f:
            for chunk in r.iter_content(chunk_size=CHUNK_SIZE):
                if chunk:  # filter out keep-alive chunks
                    f.write(chunk)
    return dest_path


def download_concurrently(urls, out_dir="downloads", max_workers=5):
    """Download given URLs concurrently using threads."""
    dest_dir = Path(out_dir)
    results = []
    errors = []

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_url = {
            executor.submit(download_file, url, dest_dir, i): url
            for i, url in enumerate(urls, start=1)
        }
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                saved_path = future.result()
                print(f"✅ Downloaded: {url} -> {saved_path}")
                results.append(saved_path)
            except Exception as e:
                print(f"❌ Failed: {url} ({e})")
                errors.append((url, e))
    print(f"\nDone. {len(results)} succeeded, {len(errors)} failed.")
    return results, errors


if __name__ == "__main__":
    # Replace these with your 5 URLs
    urls = [
        "https://example.com/file1.jpg",
        "https://example.com/file2.pdf",
        "https://example.com/file3.zip",
        "https://example.com/file4.png",
        "https://example.com/file5.mp4",
    ]
    download_concurrently(urls, out_dir="downloads", max_workers=5)

Challenge 8 Grok 4 Code:

import threading
import urllib.request

def download_file(url, filename):
    """
    Downloads a file from the given URL and saves it with the specified filename.
    """
    try:
        urllib.request.urlretrieve(url, filename)
        print(f"Downloaded {filename} from {url}")
    except Exception as e:
        print(f"Error downloading {url}: {e}")

# List of 5 example URLs (replace with your actual URLs)
urls = [
    "https://www.example.com/file1.txt",
    "https://www.example.com/file2.pdf",
    "https://www.example.com/file3.jpg",
    "https://www.example.com/file4.zip",
    "https://www.example.com/file5.docx"
]

# Corresponding filenames to save the downloads
filenames = [
    "file1.txt",
    "file2.pdf",
    "file3.jpg",
    "file4.zip",
    "file5.docx"
]

# Create and start threads for concurrent downloads
threads = []
for url, filename in zip(urls, filenames):
    thread = threading.Thread(target=download_file, args=(url, filename))
    thread.start()
    threads.append(thread)

# Wait for all threads to complete
for thread in threads:
    thread.join()

print("All downloads completed.")

Challenge 8 Observations:

  • GPT 5 Speed: Slower response, “thought” a bit longer (~10 seconds).
  • GPT 5 Code: Provided long and elegant code with entrypoint, with comments.
  • Grok 4 Speed: Even slower response, “thought” a bit longer (~21 seconds).
  • Grok 4 Code: Provided shorter code and without entrypoint, with comments.
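
The practical difference between the two Challenge 8 answers is easier to see without the network. The following is a minimal sketch, not either model's output: fake_download and the file name bad.bin are hypothetical stand-ins for a real download. It shows how a thread pool lets the caller collect failures through future.result(), whereas an exception raised in a bare threading.Thread target is not re-raised by join(), so the worker has to catch and report it itself.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_download(name):
    """Stand-in for a real download; fails for one hypothetical file name."""
    if name == "bad.bin":
        raise OSError("simulated network failure")
    return f"saved {name}"


def run_all(names, max_workers=5):
    """Run fake_download for every name and separate successes from failures."""
    results, errors = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(fake_download, name): name for name in names}
        for future in as_completed(futures):
            try:
                # result() re-raises any exception from the worker thread.
                results.append(future.result())
            except OSError as exc:
                errors.append((futures[future], str(exc)))
    # Completion order is not deterministic, so sort for stable output.
    return sorted(results), errors


if __name__ == "__main__":
    done, failed = run_all(["a.txt", "bad.bin", "b.txt"])
    print(done)
    print(failed)
```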

Python Challenge 9 - Database Interaction

Challenge 9 Prompt:

Connect to an SQLite database, create a ‘users’ table, insert sample data, and query all users.

Challenge 9 GPT 5 Code:

import sqlite3

def main():
    # Connect to (or create) the database
    conn = sqlite3.connect("example.db")
    cursor = conn.cursor()

    # Create the users table if it doesn't exist
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS users (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT NOT NULL,
            email TEXT UNIQUE NOT NULL
        )
    """)

    # Insert sample data
    sample_users = [
        ("Alice", "alice@example.com"),
        ("Bob", "bob@example.com"),
        ("Charlie", "charlie@example.com"),
    ]

    try:
        cursor.executemany("INSERT INTO users (name, email) VALUES (?, ?)", sample_users)
    except sqlite3.IntegrityError:
        # Ignore duplicates if run multiple times
        pass

    # Commit changes
    conn.commit()

    # Query all users
    cursor.execute("SELECT id, name, email FROM users")
    users = cursor.fetchall()

    print("Users in database:")
    for user in users:
        print(user)

    # Close the connection
    conn.close()

if __name__ == "__main__":
    main()

Challenge 9 Grok 4 Code:

import sqlite3

# Connect to an in-memory SQLite database
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()

# Create the 'users' table
cursor.execute('''
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT,
        age INTEGER
    )
''')

# Insert sample data
cursor.execute("INSERT INTO users (name, age) VALUES ('Alice', 30)")
cursor.execute("INSERT INTO users (name, age) VALUES ('Bob', 25)")
cursor.execute("INSERT INTO users (name, age) VALUES ('Charlie', 35)")

# Commit the changes
conn.commit()

# Query all users
cursor.execute("SELECT * FROM users")
results = cursor.fetchall()

# Print the results
print("ID | Name | Age")
print("----------------")
for row in results:
    print(f"{row[0]} | {row[1]} | {row[2]}")

# Close the connection
conn.close()

Challenge 9 Observations:

  • GPT 5 Speed: Very fast response, near instant. It understood that a script was needed, even though the prompt bluntly said “connect…” rather than “write” or “create”.
  • GPT 5 Code: Provided long and elegant code with entrypoint, with comments.
  • Grok 4 Speed: Slower response, “thought” a bit longer (~16 seconds).
  • Grok 4 Code: Provided a bit shorter code and without entrypoint, with comments. It uses an in-memory database and a different schema (name and age instead of name and email).
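
GPT 5's script handles reruns by catching sqlite3.IntegrityError around executemany, which stops at the first duplicate, so any new rows listed after it are not inserted. One possible alternative is SQLite's INSERT OR IGNORE, sketched below under assumptions: add_users is a hypothetical helper, the email addresses are placeholders, and the table layout simply mirrors the one GPT 5 chose.

```python
import sqlite3


def add_users(conn, users):
    """Insert (name, email) rows, skipping any whose email already exists."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS users (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT NOT NULL,
            email TEXT UNIQUE NOT NULL
        )
        """
    )
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO users (name, email) VALUES (?, ?)", users
        )
    return conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    # Placeholder sample data for illustration only.
    sample = [("Alice", "alice@example.com"), ("Bob", "bob@example.com")]
    add_users(connection, sample)
    # The second call finds both emails already present and adds nothing.
    print(add_users(connection, sample))
    connection.close()
```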

Python Challenge 10 - Unit Testing

Challenge 10 Prompt:

Write a Python unittest that verifies a function ‘is_prime(n)’ correctly identifies prime numbers.

Challenge 10 GPT 5 Code:

import unittest

# Example implementation of is_prime for testing purposes
# If you already have this function, you can remove this definition.
def is_prime(n):
    """Return True if n is a prime number, else False."""
    if n <= 1:
        return False
    if n <= 3:
        return True
    if n % 2 == 0 or n % 3 == 0:
        return False
    i = 5
    while i * i <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True


class TestIsPrime(unittest.TestCase):
    def test_primes(self):
        primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
        for p in primes:
            with self.subTest(p=p):
                self.assertTrue(is_prime(p), f"{p} should be prime")

    def test_non_primes(self):
        non_primes = [0, 1, 4, 6, 8, 9, 10, 12, 15, 20, 25]
        for n in non_primes:
            with self.subTest(n=n):
                self.assertFalse(is_prime(n), f"{n} should not be prime")

if __name__ == "__main__":
    unittest.main()

Challenge 10 Grok 4 Code:

import unittest

# Assuming the is_prime function is defined in a module named prime_checker
# Replace 'prime_checker' with the actual module name where is_prime is defined
from prime_checker import is_prime

class TestIsPrime(unittest.TestCase):
    def test_prime_numbers(self):
        """Test that prime numbers are correctly identified."""
        primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
        for num in primes:
            with self.subTest(num=num):
                self.assertTrue(is_prime(num), f"{num} should be prime")

    def test_non_prime_numbers(self):
        """Test that non-prime numbers are correctly identified."""
        non_primes = [0, 1, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 22, 24, 25, 26, 27, 28]
        for num in non_primes:
            with self.subTest(num=num):
                self.assertFalse(is_prime(num), f"{num} should not be prime")

    def test_negative_numbers(self):
        """Test that negative numbers are not considered prime."""
        negatives = [-1, -2, -3, -5, -7]
        for num in negatives:
            with self.subTest(num=num):
                self.assertFalse(is_prime(num), f"{num} should not be prime")

    def test_one_and_zero(self):
        """Explicitly test 0 and 1, which are not prime."""
        self.assertFalse(is_prime(0))
        self.assertFalse(is_prime(1))

if __name__ == '__main__':
    unittest.main()

Challenge 10 Observations:

  • GPT 5 Speed: Very fast response, near instant.
  • GPT 5 Code: Provided long and elegant code with entrypoint, with comments.
  • Grok 4 Speed: Slower response, “thought” a bit longer (~12 seconds).
  • Grok 4 Code: Provided test-only code (is_prime is imported from a placeholder module, prime_checker) and, for the first time, with entrypoint and comments. Very interestingly, the test class name (TestIsPrime) is the same as ChatGPT 5’s, and the test method names are similar.

Criteria for Evaluation

We will rate each AI on:

  • Code Correctness: Does it work without major edits?
  • Readability: Is the code clean and well-commented?
  • Efficiency: Does it use optimal methods?
  • Error Handling: Does it anticipate possible failures?
  • Explainability: Does it provide clear reasoning?

Preliminary Observations

From previous and current experience:

  • ChatGPT 5 tends to give more verbose, well-documented code, and responds much faster.
  • Grok 4 prefers minimalism, responds more slowly, and sometimes omits comments.

Both excel at standard tasks, but Grok may struggle with multi-step reasoning prompts.

What stood out across the 10 challenges

  • Prompt adherence: GPT-5 stayed on-task (e.g., BTC API). Grok 4 occasionally drifted (Challenge 4).
  • Entrypoints & structure: GPT-5 consistently used entrypoints and helpers; Grok 4 often wrote single-file scripts without an entrypoint.
  • Error handling: GPT-5 added timeouts/raise_for_status/try-except more often; Grok 4 tended to be minimal.
  • Dependencies & assumptions: GPT-5 used stdlib where possible; Grok 4 leaned on pandas or simpler urllib defaults.
  • Data model assumptions: GPT-5 inferred fields and computed values (qty × price); Grok 4 assumed pre-aggregated columns.
  • Algorithmic care: Both solved the logic tasks; GPT-5’s example had a minor variable bug, while Grok 4’s HN example had syntax typos.
  • Performance posture: GPT-5 used streaming + thread pools for downloads; Grok 4 used raw threads + urlretrieve (simpler, less robust).

Verdict

While both tools can write functional Python code, the choice may come down to developer preference:

  • Choose ChatGPT 5 if you value detailed explanations, step-by-step reasoning, fast code generation, and extensive comments.
  • Choose Grok 4 if you prefer concise, simple code with minimal fluff and can accept slower code generation.

I honestly prefer ChatGPT 5 because it responds much faster and produces better, more detailed Python code. Sorry, Elon Musk.

Frequently Asked Questions (FAQ)

Is OpenAI ChatGPT 5 or Grok 4 better for beginners? ChatGPT 5. It explains more, includes safer defaults (timeouts, error handling), and uses cleaner structure.

Which produced fewer code issues in these tests? ChatGPT 5 overall. Grok 4 had occasional prompt drift and minor syntax errors in scraping.

Which is faster? In these runs, ChatGPT 5 responded faster on average. The timings are listed per challenge.

Do I need to review the code they generate? Yes. Both models can make small mistakes; always run tests and add guardrails for I/O and network code.

Which handled files, images, and networking more robustly? ChatGPT 5. It tended to add entrypoints, timeouts, streaming, and better image resampling.

Does Grok 4 have advantages? Yes: shorter, more concise scripts when you already know the context and want minimal output.

What prompt style worked best? Be explicit about inputs/outputs, libraries, and edge cases (example: ‘use requests with timeout, print JSON parse errors’).

Can I rely on either model for production code? Use them as accelerators, not replacements: keep tests, linting, and security reviews in your pipeline.