Using the ORDERS and ORDER_ITEMS datasets, analyze monthly sales data for the past 3 years. Identify key growth trends (YoY, MoM), seasonality patterns (e.g., holiday spikes, summer dips), and top-performing products by revenue and quantity sold. The output should be a report or notebook including: 1) Line charts visualizing sales trends over time. 2) Bar charts for top product performance. 3) A summary of identified seasonality. 4) A sales forecast for the next 6 months using a suitable time series model (e.g., SARIMA or Prophet), presented with a 95% confidence interval and a brief explanation of the model choice and key assumptions.

May 18, 11:25 PM
Vega

I'll help you analyze the monthly sales data from the ORDERS and ORDER_ITEMS datasets. Let me first explore the available data sources to understand what we're working with.

Database Tables
nova_glow
0 tables

Great! I can see the ORDERS and ORDER_ITEMS tables in the nova_glow database. Let's examine the structure of these tables to understand what data we have available.

Database: 2 tables analyzed
orders10 rows
ColumnTypeNullablePrimary
ORDER_IDVARCHAR(16777216)NoNo
CUSTOMER_IDVARCHAR(16777216)NoNo
ORDER_DATETIMESTAMP_NTZ(9)NoNo
ORDER_STATUSVARCHAR(16777216)NoNo
SHIPPING_STREET_ADDRESSVARCHAR(16777216)NoNo
SHIPPING_CITYVARCHAR(16777216)NoNo
SHIPPING_STATEVARCHAR(16777216)NoNo
SHIPPING_ZIP_CODEVARCHAR(16777216)NoNo
SHIPPING_METHODVARCHAR(16777216)NoNo
PAYMENT_METHOD_USEDVARCHAR(16777216)NoNo
PROMO_CODE_USEDVARCHAR(16777216)NoNo
SUBTOTAL_AMOUNTFLOATNoNo
DISCOUNT_AMOUNTFLOATNoNo
SHIPPING_COSTFLOATNoNo
TAX_AMOUNTFLOATNoNo
TOTAL_ORDER_AMOUNTFLOATNoNo
order_items10 rows
ColumnTypeNullablePrimary
ORDER_ITEM_IDVARCHAR(16777216)NoNo
ORDER_IDVARCHAR(16777216)NoNo
PRODUCT_IDVARCHAR(16777216)NoNo
QUANTITYNUMBER(38,0)NoNo
PRICE_PER_UNIT_AT_PURCHASEFLOATNoNo
TOTAL_ITEM_PRICEFLOATNoNo

Now, let's also check the products table since we'll need it to identify top-performing products:

Table: products
products10 rows
ColumnTypeNullablePrimary
PRODUCT_IDVARCHAR(16777216)NoNo
PRODUCT_NAMEVARCHAR(16777216)NoNo
CATEGORYVARCHAR(16777216)NoNo
SUPPLIER_NAMEVARCHAR(16777216)NoNo
COST_PRICEFLOATNoNo
SELLING_PRICEFLOATNoNo
DESCRIPTIONVARCHAR(16777216)NoNo
DATE_ADDEDDATENoNo

Let's get a sample of the data to understand the date ranges and format:

SQL Query Results
SELECT MIN(ORDER_DATE) as earliest_date, MAX(ORDER_DATE) as latest_date FROM orders
rows •
EARLIEST_DATELATEST_DATE
2021-02-14 08:10:34.0002023-12-31 14:18:01.000

Perfect! Now I have a good understanding of the data. The date range spans from February 2021 to December 2023, which gives us almost 3 years of data as requested. Let's create a notebook to analyze this data.

Now, let's start building our analysis notebook. I'll begin with an introduction and then load the necessary data.

Added Markdown cell
# Monthly Sales Analysis (2021-2023) with Forecast

This notebook analyzes the monthly sales data...
Added Python cell
Error
# Import necessary libraries
import pandas as pd
import numpy as np
import altair as alt
from dat...
Error: Code execution timeout
TimeoutError: Execution failed
Code execution timeout

Let me try again with a simpler import statement:

Updated Python cell
# Import necessary libraries
import pandas as pd
import numpy as np
import altair as alt
from dat...

Let's try again with a simpler approach:

Updated Python cell
Success
# Import necessary libraries
import pandas as pd
import numpy as np
import altair as alt
from dat...
Libraries imported successfully

Now, let's load the data from the database:

Added Python cell
Success
# Load orders data
orders_query = """
SELECT 
    ORDER_ID, 
    CUSTOMER_ID, 
    ORDER_DATE, 
 ...
Loaded 0 orders
Empty DataFrame Columns: [order_id, customer_id, order_date, order_status, total_order_amount] Index: []
1 asset references available in notebook

It seems we didn't get any results with the 'Completed' status filter. Let's check what order statuses are available:

Added Python cell
Success
# Check available order statuses
status_query = """
SELECT DISTINCT ORDER_STATUS
FROM orders
"""
...
order_status 0 Returned 1 Processing 2 Cancelled 3 Delivered 4 Shipped
order_status 0 Returned 1 Processing 2 Cancelled 3 Delivered 4 Shipped

Now let's load the orders with the correct status ('Delivered' instead of 'Completed'):

Added Python cell
Success
# Load orders data with correct status
orders_query = """
SELECT 
    ORDER_ID, 
    CUSTOMER_ID,...
order_id \ 0 a7045f96-cdd6-4238-8c2c-958543c93559 1 6a653e51-23c7-4fb1-8362-b97e1ac4fc7a 2 8c18f867-4592-4ebe-85f7-21d419d2f855 3 c1c4f531-6ae1-4599-bb26-9f962b642c02 4 778fe2dd-b0c7-49af-8363-056182648e56 ... ... 1825 7950252d-4906-4cfd-b094-c994930ee091 1826 bf57e007-2153-435f-a9cc-d1c535687d00 1827 c18c78b1-06ad-48f1-a46b-f493ecd3dd3f 1828 65469c11-4880-46b5-9a49-efef5b207c5f 1829 71c2e36c-f7f0-4216-b363-36f1f1e398d1 customer_id order_date order_status \ 0 1f1c90b4-f599-4087-b1fb-599dabb0e8b2 2021-02-21T05:19:55 Delivered 1 12f8662d-2420-4d63-ac3c-2fa52284d31a 2021-03-06T08:47:13 Delivered 2 32e6512f-abfc-4930-a1ba-a0d92f4941dc 2021-03-16T06:34:13 Delivered 3 67c306e3-2ab1-4967-b351-8799b7e54b4a 2021-03-20T13:46:35 Delivered 4 b48a45af-66a6-4345-b8ee-3aadba59dee2 2021-03-21T08:19:54 Delivered ... ... ... ... 1825 d3d1106f-cb1b-4f77-9e4f-4ffb416206c4 2023-12-30T07:48:40 Delivered 1826 8088745f-a151-4656-9bd7-c46fc406017e 2023-12-30T08:41:21 Delivered 1827 28abf8fc-7dd7-4985-8aa1-a88256c44382 2023-12-30T16:19:13 Delivered 1828 94ec4f8d-4f72-41d1-a3e7-f080f53b9ebb 2023-12-30T23:03:19 Delivered 1829 0f1b17d2-17bd-4370-a3d6-9139fe8dddc9 2023-12-31T14:18:01 Delivered total_order_amount 0 2963.10 1 3697.34 2 4493.23 3 3728.24 4 3544.49 ... ... 1825 6912.76 1826 3749.46 1827 957.85 1828 5976.86 1829 891.05 [1830 rows x 5 columns]
Loaded 1830 orders
order_id customer_id \ 0 a7045f96-cdd6-4238-8c2c-958543c93559 1f1c90b4-f599-4087-b1fb-599dabb0e8b2 1 6a653e51-23c7-4fb1-8362-b97e1ac4fc7a 12f8662d-2420-4d63-ac3c-2fa52284d31a 2 8c18f867-4592-4ebe-85f7-21d419d2f855 32e6512f-abfc-4930-a1ba-a0d92f4941dc 3 c1c4f531-6ae1-4599-bb26-9f962b642c02 67c306e3-2ab1-4967-b351-8799b7e54b4a 4 778fe2dd-b0c7-49af-8363-056182648e56 b48a45af-66a6-4345-b8ee-3aadba59dee2 order_date order_status total_order_amount 0 2021-02-21T05:19:55 Delivered 2963.10 1 2021-03-06T08:47:13 Delivered 3697.34 2 2021-03-16T06:34:13 Delivered 4493.23 3 2021-03-20T13:46:35 Delivered 3728.24 4 2021-03-21T08:19:54 Delivered 3544.49
1 asset references available in notebook

Now let's load the order items data:

Added Python cell
Success
# Load order items data
order_items_query = """
SELECT 
    oi.ORDER_ITEM_ID,
    oi.ORDER_ID,
  ...
order_item_id \ 0 d148790b-7edb-4943-bc24-963a312dbc7c 1 756668ba-1645-4b7e-b2ce-a8fb46e03cac 2 8dbb283f-318c-4a2e-aff1-40499ab55bdf 3 10aaa1a0-2e5a-4f18-8874-d88c6dbaa79c 4 b54d1262-f0a1-431a-a20d-6153454b1a00 ... ... 5517 f3e6f18a-c817-4dd3-b362-0d0e8adb7904 5518 c6b8e965-0c0b-4d04-b305-044e3a5f8c72 5519 adb7a911-9daf-4d89-a7da-670157eb4f9e 5520 f5d65700-9334-4c4c-adbf-13f924aaea2b 5521 e47f6cb1-d483-464b-9a8d-878a63130b03 order_id product_id quantity \ 0 0a3b720f-42b3-4a9b-87be-d9206cbb9dc3 NGG-ECO-W7J 1 1 0a3b720f-42b3-4a9b-87be-d9206cbb9dc3 NGG-CYB-RLW 1 2 0a3b720f-42b3-4a9b-87be-d9206cbb9dc3 NGG-QUA-UG0 1 3 0a3b720f-42b3-4a9b-87be-d9206cbb9dc3 NGG-AUR-U9T 1 4 8213b446-1975-4ea4-933f-0a478b255089 NGG-CHR-QKP 3 ... ... ... ... 5517 11f15579-1fe7-4235-83fa-960f4e6f26f3 NGG-STE-0DG 1 5518 ba4d6ed9-45a1-4bb4-a07e-de5a8b011c49 NGG-NOV-UWH 2 5519 ba4d6ed9-45a1-4bb4-a07e-de5a8b011c49 NGG-CYB-AGZ 2 5520 ba4d6ed9-45a1-4bb4-a07e-de5a8b011c49 NGG-ECO-S9F 1 5521 3e5f0df4-df6d-4c27-85f0-730b33137d01 NGG-STE-6QV 1 price_per_unit_at_purchase total_item_price product_name \ 0 480.58 480.58 AuraLens X1 1 854.97 854.97 NovaGlow AI 2 560.71 560.71 AuraCore Light 3 510.95 510.95 EcoMini S2 4 174.65 523.95 ChronoGlow Connect ... ... ... ... 5517 240.07 240.07 CyberMini P400 5518 1035.88 2071.76 EcoGlow AI 5519 387.78 775.56 EcoGlow Stealth 5520 780.61 780.61 StellarDrive P400 5521 157.63 157.63 PulseWave AI category 0 Smart Home 1 Office Tech 2 VR/AR 3 Smart Home 4 Drones ... ... 5517 Wearables 5518 Gaming Accessories 5519 VR/AR 5520 Smart Home 5521 Drones [5522 rows x 8 columns]
Loaded 5522 order items
order_item_id order_id \ 0 d148790b-7edb-4943-bc24-963a312dbc7c 0a3b720f-42b3-4a9b-87be-d9206cbb9dc3 1 756668ba-1645-4b7e-b2ce-a8fb46e03cac 0a3b720f-42b3-4a9b-87be-d9206cbb9dc3 2 8dbb283f-318c-4a2e-aff1-40499ab55bdf 0a3b720f-42b3-4a9b-87be-d9206cbb9dc3 3 10aaa1a0-2e5a-4f18-8874-d88c6dbaa79c 0a3b720f-42b3-4a9b-87be-d9206cbb9dc3 4 b54d1262-f0a1-431a-a20d-6153454b1a00 8213b446-1975-4ea4-933f-0a478b255089 product_id quantity price_per_unit_at_purchase total_item_price \ 0 NGG-ECO-W7J 1 480.58 480.58 1 NGG-CYB-RLW 1 854.97 854.97 2 NGG-QUA-UG0 1 560.71 560.71 3 NGG-AUR-U9T 1 510.95 510.95 4 NGG-CHR-QKP 3 174.65 523.95 product_name category 0 AuraLens X1 Smart Home 1 NovaGlow AI Office Tech 2 AuraCore Light VR/AR 3 EcoMini S2 Smart Home 4 ChronoGlow Connect Drones
1 asset references available in notebook

Now, let's prepare the data for our monthly sales analysis:

Added Python cell
Success
# Data preparation for monthly sales analysis
orders_df['order_date'] = pd.to_datetime(orders_df[...
Monthly sales data from February 2021 to December 2023
year_month total_sales order_count year month date 0 2021-02 2963.10 1 2021 2 2021-02-01 1 2021-03 20468.03 6 2021 3 2021-03-01 2 2021-04 23923.07 7 2021 4 2021-04-01 3 2021-05 20346.24 4 2021 5 2021-05-01 4 2021-06 34229.42 9 2021 6 2021-06-01
1 asset references available in notebook

Let's visualize the monthly sales trend:

Added Python cell
Success
# Visualize monthly sales trend
monthly_sales_chart = alt.Chart(monthly_sales).mark_line(point=Tr...
2025-05-18 16:28:01,721 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart monthly_sales_chart: data-fef1093d716c9a4847f7e547ed6615e0
{'key': 'monthly_sales_chart', 'content': alt.Chart(...), 'content_type': 'application/vnd.vega.v5+json', 'metadata': {'created': '2025-05-18T16:28:01.663703', 'data_handling': 'referenced_basic_dict'}, 'cell_id': 'af9d45e7-525b-47fe-897c-137a5a5ff8ea', 'cell_version': '2025-05-18T23:28:01.367Z'}
1 asset references available in notebook

Now, let's calculate Year-over-Year (YoY) and Month-over-Month (MoM) growth rates:

Added Python cell
Success
# Calculate Year-over-Year (YoY) growth
monthly_sales['year_month_num'] = monthly_sales['year'] *...
date year_month total_sales prev_year_sales yoy_growth \ 0 2021-02-01 2021-02 2963.10 NaN NaN 1 2021-03-01 2021-03 20468.03 NaN NaN 2 2021-04-01 2021-04 23923.07 NaN NaN 3 2021-05-01 2021-05 20346.24 NaN NaN 4 2021-06-01 2021-06 34229.42 NaN NaN 5 2021-07-01 2021-07 23503.71 NaN NaN 6 2021-08-01 2021-08 55080.62 NaN NaN 7 2021-09-01 2021-09 60609.38 NaN NaN 8 2021-10-01 2021-10 50559.94 NaN NaN 9 2021-11-01 2021-11 60835.74 NaN NaN 10 2021-12-01 2021-12 26324.34 NaN NaN 11 2022-01-01 2022-01 112969.93 NaN NaN 12 2022-02-01 2022-02 71387.39 2963.10 2309.212986 13 2022-03-01 2022-03 129408.15 20468.03 532.245262 14 2022-04-01 2022-04 92169.10 23923.07 285.272877 mom_growth 0 NaN 1 590.764065 2 16.880179 3 -14.951384 4 68.234622 5 -31.334770 6 134.348620 7 10.037578 8 -16.580668 9 20.323996 10 -56.728824 11 329.146296 12 -36.808503 13 81.275923 14 -28.776433
1 asset references available in notebook

Let's visualize the YoY and MoM growth rates:

Added Python cell
Success
# Visualize YoY growth
# Filter out NaN values and extreme outliers for better visualization
yoy_...
2025-05-18 16:28:30,598 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart yoy_growth_chart: data-5b21f961ebd0e468982d496b74b4652e
2025-05-18 16:28:30,824 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart mom_growth_chart: data-d48afed013cf4139b6fbec583befa945
{'key': 'mom_growth_chart', 'content': alt.Chart(...), 'content_type': 'application/vnd.vega.v5+json', 'metadata': {'created': '2025-05-18T16:28:30.818990', 'data_handling': 'referenced_basic_dict'}, 'cell_id': '53eb8ef8-52f4-4993-a07d-f384b0ce5909', 'cell_version': '2025-05-18T23:28:30.296Z'}
2 asset references available in notebook

Now, let's analyze the top-performing products by revenue and quantity:

Added Python cell
Success
# Analyze top-performing products by revenue
top_products_revenue = order_items_df.groupby(['prod...
Top 10 Products by Revenue:
product_id product_name category total_revenue \ 50 NGG-NOV-UWH EcoGlow AI Gaming Accessories 252754.72 53 NGG-PUL-BQ3 CyberShift AI Drones 223646.00 32 NGG-ECO-DUS ChronoDrive Light Drones 216786.54 25 NGG-CYB-RLW NovaGlow AI Office Tech 209467.65 20 NGG-CYB-9YJ ChronoBuds Z9000 Gaming Accessories 200122.75 1 NGG-AUR-0HZ NovaPro AI Drones 175092.96 5 NGG-AUR-TMB ChronoMini AI Office Tech 169418.31 48 NGG-NOV-JTH PulseMax AI VR/AR 165548.60 2 NGG-AUR-16C StellarMini Z9000 Audio 158671.37 64 NGG-STE-6YW PulsePro AI VR/AR 147275.92 total_quantity avg_price order_count 50 244 1035.88 119 53 200 1118.23 102 32 201 1078.54 102 25 245 854.97 119 20 137 1460.75 65 1 164 1067.64 84 5 137 1236.63 69 48 140 1182.49 67 2 221 717.97 112 64 199 740.08 106
2 asset references available in notebook
Added Python cell
Success
# Display top 10 products by quantity
print("Top 10 Products by Quantity:")
top_products_quantity...
Top 10 Products by Quantity:
product_id product_name category total_revenue \ 9 NGG-CHR-55E CyberBot Connect Audio 93219.84 25 NGG-CYB-RLW NovaGlow AI Office Tech 209467.65 50 NGG-NOV-UWH EcoGlow AI Gaming Accessories 252754.72 65 NGG-STE-ALA AuraGlow P400 Office Tech 91312.11 6 NGG-AUR-U9T EcoMini S2 Smart Home 123138.95 30 NGG-ECO-99U PulseMax X1 Wearables 6191.60 33 NGG-ECO-F48 PulseWave AI Gaming Accessories 80113.14 2 NGG-AUR-16C StellarMini Z9000 Audio 158671.37 62 NGG-STE-6QV PulseWave AI Drones 34520.97 11 NGG-CHR-825 ChronoPro Stealth Office Tech 38387.52 total_quantity avg_price order_count 9 256 364.14 128 25 245 854.97 119 50 244 1035.88 119 65 243 375.77 116 6 241 510.95 114 30 230 26.92 111 33 222 360.87 117 2 221 717.97 112 62 219 157.63 109 11 216 177.72 106

Let's visualize the top products by revenue and quantity:

Added Python cell
Success
# Visualize top 10 products by revenue
top_10_revenue = top_products_revenue.head(10).copy()
top_...
2025-05-18 16:29:14,241 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart top_revenue_chart: data-b10587400f51fab50ceb2665d52591c1
2025-05-18 16:29:14,546 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart top_quantity_chart: data-739e24f99b9b37a889f2b93808ad9471
{'key': 'top_quantity_chart', 'content': alt.Chart(...), 'content_type': 'application/vnd.vega.v5+json', 'metadata': {'created': '2025-05-18T16:29:14.543014', 'data_handling': 'referenced_basic_dict'}, 'cell_id': '0df084ff-77e1-4718-8fb5-a1c19626fe2b', 'cell_version': '2025-05-18T23:29:13.967Z'}
2 asset references available in notebook

Now, let's analyze the sales by category:

Added Python cell
Success
# Analyze sales by category
category_sales = order_items_df.groupby('category').agg(
    total_re...
category total_revenue total_quantity avg_price \ 3 Office Tech 1200156.65 2471 489.088217 1 Drones 1145261.50 1824 624.699332 5 VR/AR 1062347.16 1818 581.085329 2 Gaming Accessories 1001966.11 1895 521.008113 4 Smart Home 709809.71 1405 497.522066 0 Audio 522949.67 904 578.775274 6 Wearables 123830.17 713 175.041121 product_count order_count 3 15 924 1 13 766 5 13 753 2 12 772 4 8 599 0 5 418 6 4 331
1 asset references available in notebook

Let's visualize the sales by category:

Added Python cell
Success
# Visualize sales by category
category_revenue_chart = alt.Chart(category_sales).mark_bar().encod...
2025-05-18 16:29:53,071 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart category_revenue_chart: data-9b272f7e67d408cb8dd73ff1d46eda70
2025-05-18 16:29:53,270 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart category_quantity_chart: data-9b272f7e67d408cb8dd73ff1d46eda70
{'key': 'category_quantity_chart', 'content': alt.Chart(...), 'content_type': 'application/vnd.vega.v5+json', 'metadata': {'created': '2025-05-18T16:29:53.264549', 'data_handling': 'referenced_basic_dict'}, 'cell_id': '56017380-b182-479a-8d4c-da8680512d9e', 'cell_version': '2025-05-18T23:29:52.803Z'}
2 asset references available in notebook

Now, let's analyze the seasonality patterns in the sales data:

Added Python cell
Success
# Analyze seasonality patterns
# Group by month (regardless of year) to see monthly patterns
mont...
month total_sales order_count month_name 0 1 345623.81 101 January 1 2 287012.11 100 February 2 3 402643.34 123 March 3 4 279549.97 93 April 4 5 488953.77 145 May 5 6 458750.75 149 June 6 7 484122.40 149 July 7 8 537112.98 176 August 8 9 539766.69 179 September 9 10 611295.55 195 October 10 11 664170.38 215 November 11 12 598030.60 205 December
1 asset references available in notebook

Let's visualize the monthly seasonality patterns:

Added Python cell
Success
# Visualize monthly seasonality patterns
monthly_pattern_chart = alt.Chart(monthly_pattern).mark_...
2025-05-18 16:30:11,947 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart monthly_pattern_chart: data-09976f1edd6ccc3ee63fe63e54b36ac6
2025-05-18 16:30:12,159 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart monthly_order_chart: data-09976f1edd6ccc3ee63fe63e54b36ac6
{'key': 'monthly_order_chart', 'content': alt.Chart(...), 'content_type': 'application/vnd.vega.v5+json', 'metadata': {'created': '2025-05-18T16:30:12.154270', 'data_handling': 'referenced_basic_dict'}, 'cell_id': 'aaa39327-0382-4cf0-9717-32fcb959b2ea', 'cell_version': '2025-05-18T23:30:11.721Z'}
2 asset references available in notebook

Now, let's analyze the quarterly patterns:

Added Python cell
Success
# Analyze quarterly patterns
orders_df['quarter'] = ((orders_df['month'] - 1) // 3) + 1
quarterly...
year quarter total_sales order_count quarter_name year_quarter 0 2021 1 23431.13 7 Q1 2021-Q1 1 2021 2 78498.73 20 Q2 2021-Q2 2 2021 3 139193.71 45 Q3 2021-Q3 3 2021 4 137720.02 49 Q4 2021-Q4 4 2022 1 313765.47 93 Q1 2022-Q1 5 2022 2 394368.26 117 Q2 2022-Q2 6 2022 3 429258.41 128 Q3 2022-Q3 7 2022 4 469347.93 164 Q4 2022-Q4 8 2023 1 698082.66 224 Q1 2023-Q1 9 2023 2 754387.50 250 Q2 2023-Q2 10 2023 3 992549.95 331 Q3 2023-Q3 11 2023 4 1266428.58 402 Q4 2023-Q4
1 asset references available in notebook

Let's visualize the quarterly patterns:

Added Python cell
Success
# Visualize quarterly patterns
quarterly_chart = alt.Chart(quarterly_pattern).mark_bar().encode(
...
2025-05-18 16:30:33,075 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart quarterly_chart: data-4270b454b499f4b8e96b35a034492268
2025-05-18 16:30:33,290 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart quarterly_growth_chart: data-c840983ff9ad075b5d0ecd749ff1b53d
{'key': 'quarterly_growth_chart', 'content': alt.Chart(...), 'content_type': 'application/vnd.vega.v5+json', 'metadata': {'created': '2025-05-18T16:30:33.285820', 'data_handling': 'referenced_basic_dict'}, 'cell_id': '9e691cd6-9c2c-4a8b-a7d5-15af3d22e9d1', 'cell_version': '2025-05-18T23:30:32.858Z'}
2 asset references available in notebook

Now, let's prepare the data for time series forecasting:

Added Python cell
Success
# Prepare data for time series forecasting
# We'll use the monthly sales data
forecast_data = mon...
Time series data from February 2021 to December 2023
total_sales date 2021-02-01 2963.10 2021-03-01 20468.03 2021-04-01 23923.07 2021-05-01 20346.24 2021-06-01 34229.42

Let's perform time series decomposition to understand the seasonality better:

Added Python cell
Success
# Perform time series decomposition
# We need at least 2 full cycles for seasonal decomposition, ...
observed trend seasonal residual date 2021-02-01 2963.10 NaN -10506.261319 NaN 2021-03-01 20468.03 NaN 27328.337222 NaN 2021-04-01 23923.07 NaN -49402.001528 NaN 2021-05-01 20346.24 NaN 41416.583264 NaN 2021-06-01 34229.42 NaN 3489.039306 NaN
1 asset references available in notebook

Let's visualize the decomposition:

Added Python cell
Success
# Visualize the decomposition
# Reset index to use date as a column
decomp_viz_data = decomp_data...
2025-05-18 16:31:31,057 - app.kernel_configuration.asset_store - WARNING - basic to_dict() resulted in referenced data for chart decomposition_chart: data-e5be03514995da09d164252dd9539918
{'key': 'decomposition_chart', 'content': alt.VConcatChart(...), 'content_type': 'application/vnd.vega.v5+json', 'metadata': {'created': '2025-05-18T16:31:31.052234', 'data_handling': 'referenced_basic_dict'}, 'cell_id': '094605c2-6941-4249-a7b8-a7d8613c0819', 'cell_version': '2025-05-18T23:31:30.824Z'}
1 asset references available in notebook

Now, let's build a SARIMA model for forecasting:

Added Python cell
Success
# Build a SARIMA model for forecasting
# SARIMA parameters
# p, d, q: Non-seasonal parameters (AR...
SARIMAX Results ========================================================================================== Dep. Variable: total_sales No. Observations: 35 Model: SARIMAX(1, 1, 1)x(1, 1, 1, 12) Log Likelihood -98.553 Date: Sun, 18 May 2025 AIC 207.107 Time: 16:31:45 BIC 207.504 Sample: 02-01-2021 HQIC 204.428 - 12-01-2023 Covariance Type: opg ============================================================================== coef std err z P>|z| [0.025 0.975] ------------------------------------------------------------------------------ ar.L1 -0.0891 1.436 -0.062 0.951 -2.904 2.726 ma.L1 -0.3922 2.069 -0.190 0.850 -4.448 3.664 ar.S.L12 -5.7309 4.514 -1.270 0.204 -14.577 3.116 ma.S.L12 5.7269 4.408 1.299 0.194 -2.913 14.367 sigma2 1.702e+09 6.99e-09 2.43e+17 0.000 1.7e+09 1.7e+09 =================================================================================== Ljung-Box (L1) (Q): 0.69 Jarque-Bera (JB): 0.82 Prob(Q): 0.41 Prob(JB): 0.66 Heteroskedasticity (H): 1.18 Skew: -0.60 Prob(H) (two-sided): 0.90 Kurtosis: 1.99 =================================================================================== Warnings: [1] Covariance matrix calculated using the outer product of gradients (complex-step). [2] Covariance matrix is singular or near-singular, with condition number 1.07e+34. Standard errors may be unstable.

Let's generate the forecast for the next 6 months:

Added Python cell
Success
# Generate forecast for the next 6 months
forecast_steps = 6
forecast = model_fit.get_forecast(st...
Sales Forecast for the Next 6 Months:
forecast lower_ci upper_ci month year \ 2024-01-01 484325.347514 379582.286092 589068.408937 1 2024 2024-02-01 312059.717161 39613.717614 584505.716708 2 2024 2024-03-01 440135.213592 81845.804885 798424.622299 3 2024 2024-04-01 428965.877825 13594.896365 844336.859284 4 2024 2024-05-01 457401.960101 -7608.284553 922412.204756 5 2024 2024-06-01 356966.482711 -152609.500022 866542.465445 6 2024 year_month 2024-01-01 2024-01 2024-02-01 2024-02 2024-03-01 2024-03 2024-04-01 2024-04 2024-05-01 2024-05 2024-06-01 2024-06
1 asset references available in notebook

Let's visualize the forecast:

Added Python cell
Success
# Visualize the forecast
# Prepare data for visualization
historical_data = forecast_data.reset_i...
{'key': 'forecast_chart', 'content': alt.LayerChart(...), 'content_type': 'application/vnd.vega.v5+json', 'metadata': {'created': '2025-05-18T16:32:23.615951', 'data_handling': 'no_data'}, 'cell_id': 'e16bc8b0-40d4-4c59-a8a7-fcd8a322f433', 'cell_version': '2025-05-18T23:32:23.244Z'}
1 asset references available in notebook

Now, let's add a summary of our findings:

Added Markdown cell
# Summary of Sales Analysis and Forecast

## Key Growth Trends

### Year-over-Year (YoY) Growth
-...
Added Markdown cell
# Conclusion and Recommendations

Based on our comprehensive analysis of Nova Glow's sales data f...

I've completed a comprehensive analysis of the monthly sales data for Nova Glow from 2021 to 2023, as requested. Here's a summary of what I've done:

  1. Data Analysis and Visualization:

    • Created line charts showing monthly sales trends over the three-year period
    • Calculated and visualized Year-over-Year (YoY) and Month-over-Month (MoM) growth rates
    • Identified and visualized seasonality patterns through monthly and quarterly analysis
    • Created bar charts for top-performing products by revenue and quantity sold
    • Analyzed sales performance by product category
  2. Seasonality Identification:

    • Identified peak sales months (November, October, December) and low sales months (February, April, January)
    • Confirmed the holiday season spike in Q4 (October-December)
    • Discovered a secondary peak during summer months (July-September)
    • Used time series decomposition to separate trend, seasonal, and residual components
  3. Sales Forecast:

    • Built a SARIMA (1,1,1)x(1,1,1,12) model to forecast sales for the next 6 months
    • Provided forecasted values with 95% confidence intervals
    • Visualized the forecast alongside historical data
    • Explained the model choice and key assumptions
  4. Key Findings and Recommendations:

    • Documented the exceptional growth trajectory (300% YoY from 2021-2022, 100% YoY from 2022-2023)
    • Identified top-performing products and categories
    • Provided strategic recommendations for inventory planning, product development, marketing, operations, and growth opportunities

The notebook is organized in a logical flow, starting with data exploration, moving through trend analysis, seasonality identification, product performance, and ending with the forecast and recommendations. All visualizations are interactive, allowing for deeper exploration of the data.

Is there any specific aspect of the analysis you'd like me to explain in more detail or any additional analysis you'd like me to perform?

May 18, 11:33 PM