2 votes
3 answers
112 views

Calculate difference between two values, including those only appearing once within the partition

DB<>Fiddle CREATE TABLE inventory ( id SERIAL PRIMARY KEY, stock_date DATE, product VARCHAR, stock_balance INT ); INSERT INTO inventory (stock_date, product, stock_balance)VALUES ...
Michi's user avatar
  • 5,565
3 votes
3 answers
123 views

How to retrieve a sub-array from result of array_agg?

I have a SQL table in postgres 14 that looks something like this: f_key data1 data2 fit 1 {'a1', 'a2'} null 3 1 {'b1', 'b2'} {'b3'} 2 2 {'c1', 'c2'} null 3 Note that data1 and data2 are arrays. I need ...
fitek's user avatar
  • 303
1 vote
2 answers
164 views

How to efficiently calculate an exponential moving average in postgres?

I'm trying to calculate the average true range on some time series dataset stored in postgres. Its calculation requires a 14 period exponential moving average of true range which based on the answer ...
user31749517's user avatar
4 votes
4 answers
216 views

Sort aggregated query results by two methods simultaneously

I need to sort a query's results by two methods at the same time. I want the first 3 records (returned) to be based on their prevalence in another table, and then I want the rest of the results sorted ...
Gavin Baumanis's user avatar
-1 votes
0 answers
145 views

How to aggregate a group by query in django?

I'm working with time series data which are represented using this model: class Price: timestamp = models.IntegerField() price = models.FloatField() Assuming timestamp has 1 min interval data,...
user31749517's user avatar
-1 votes
2 answers
191 views

Calculate SUM over a primary key and between dates

My query: SELECT c.CustID, o.OrderID, SUM(ol.Qty * ol.Price) AS SUMOrder, AVG(SUM(ol.Qty * ol.Price)) OVER (PARTITION BY c.CustID) AS AVGAllOrders, COUNT(*) AS Countorders, SUM(...
Neccehh's user avatar
  • 41
-1 votes
1 answer
169 views

Assign unique values in a set-based approach

Simplifying, I have the following data: Col1 Col2 A X A Y A Z B X B Y B Z C Z I need to receive the following result: Col1 Col2 A X B Y C Z In other words: For each value in the left column, I need to ...
Hammy's user avatar
  • 11
0 votes
0 answers
64 views

Polars bug using windowed aggregate functions on Decimal type columns

Windowed aggregate functions on Decimal-types move decimals to integers I found a bug in polars (version 1.21.0 in a Python 3.10.8 environment) using windowed aggregate functions. They are not ...
jpm_phd's user avatar
  • 935
3 votes
1 answer
117 views

Why `.first()`, and why before `.over()`, in `with_columns` expression function composition chain

New to Polars, seeking help understanding why part of the function composition for the expression in the .with_columns() snippet below has to be done in that particular order. Specifically, I don't ...
user1665921's user avatar
2 votes
2 answers
69 views

Compute group-wise residual for polars data frame

I am in a situation where I have a data frame with X and Y values as well as two groups GROUP1 and GROUP2. Looping over both of the groups, I want to fit a linear model against the X and Y data and ...
Thomas's user avatar
  • 1,351
0 votes
1 answer
54 views

BigQuery get rolling average of variable 1 if variable 2 >= quantile

Say I want to get the rolling average of variable x where a second variable y is in the top 5th percentile (over that window). I can get the rolling average alone with something like this SELECT ...
dfried's user avatar
  • 567
1 vote
1 answer
43 views

How to calculate the maximum drawdown of a stock over a rolling time window?

In quantitative finance, maximum drawdown is a key risk metric that measures the largest decline from a peak to a trough over a period. I want to calculate the maximum drawdown over the past 10 ...
Huang WeiFeng's user avatar
2 votes
1 answer
199 views

Find corresponding date of max value in a rolling window of each partition

Sample code: import polars as pl from datetime import date from random import randint df = pl.DataFrame({ "category": [cat for cat in ["A", "B"] for _ in range(1, ...
Jonathan's user avatar
  • 2,333
1 vote
1 answer
135 views

Get a grouped sum in polars, but keep all individual rows

I am breaking my head over this probably pretty simple question and I just can't find the answer anywhere. I want to create a new column with a grouped sum of another column, but I want to keep all ...
gernophil's user avatar
  • 637
1 vote
1 answer
59 views

Group-By column in polars DataFrame inside with_columns

I have the following dataframe: import polars as pl df = pl.DataFrame({ 'ID': [1, 1, 5, 5, 7, 7, 7], 'YEAR': [2025, 2025, 2023, 2024, 2020, 2021, 2021] }) shape: (7, 2) ┌─────┬──────┐ │ ID ┆ ...
Phil-ZXX's user avatar
  • 3,601
1 vote
1 answer
78 views

In PostgreSQL do ranking window functions heed the window frame or act on the entire partition?

I am learning window functions, primarily with this page of the docs. I am trying to categorize the window functions according to whether they heed window frames, or ignore them and act on the ...
Logan O'Brien's user avatar
3 votes
2 answers
81 views

How to filter sequential event data according to whether record is followed by specific event within X minutes?

I have some data with a timestamp column t, an event category column cat, and a user_id column. cat can take n values, including value A. I want to select records which are followed (not necessarily ...
Max Davy's user avatar
1 vote
1 answer
71 views

Median with a sliding window

The goal is to use MEDIAN as a window function with a sliding window of a specific size. SELECT *, MEDIAN(n) OVER(ORDER BY id ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) FROM test_data ORDER BY id;...
Lukasz Szozda's user avatar
1 vote
2 answers
87 views

How to get the max amount per day for a month

I have a table with two columns: demo at db<>fiddle create table your_table("Date","Count")as values ('2022-01-13'::date, 8) ,('2022-01-18'::date, 14) ,('2022-01-25'::...
Owen's user avatar
  • 13
2 votes
2 answers
73 views

Identify duplicates within a period of time using Redshift SQL

In a table, I have plan details of customers with their customer_id and enroll_date. Now, I want to identify duplicate and valid enrollments from the overall data. Duplicate: If a customer enrolls a ...
Lakshmi Sruthi K's user avatar
1 vote
1 answer
134 views

How to Exclude Rows Based on a Dynamic Condition in a PySpark Window Function?

I am working with PySpark and need to create a window function that calculates the median of the previous 5 values in a column. However, I want to exclude rows where a specific column feature is True. ...
user29963762's user avatar
1 vote
1 answer
62 views

MySQL filtered gaps and islands: avoiding temporaries and filesorts?

CREATE TABLE `messages` ( `ID` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT, `Arrival` TIMESTAMP NOT NULL, `SenderID` INT UNSIGNED NOT NULL, -- Fields describing messages skipped PRIMARY ...
Dmitry Vasiliev's user avatar
0 votes
1 answer
55 views

Sum Time Differences over multiple groups in MySQL

I have a table in MySQL... # id, admin_id, appointment_id, timestamp '1', '10', '1', '2025-03-01 08:00:00' '2', '10', '1', '2025-03-01 09:00:00' '3', '10', '2', '2025-04-01 08:00:00' '4', '10', '2', '...
AQuirky's user avatar
  • 5,316
1 vote
1 answer
85 views

Aggregate 3-month rolling dates with overlap [closed]

Suppose I have below dataset: date Value 01-Jul-24 37 01-Aug-24 76 01-Sep-24 25 01-Oct-24 85 01-Nov-24 27 01-Dec-24 28 And I want to aggregate by 3 months rolling:...
ccgg's user avatar
  • 13
0 votes
1 answer
62 views

segmented monthly snapshots of validly eligible user counts

I've been trying to figure out a SQL (in postgresql) query for a cohort-type analysis at work and can't figure this one out for the life of me. I need a snapshot count of the number of valid users at ...
Eleanor Brock's user avatar
1 vote
2 answers
163 views

How can I perform a calculation on a rolling window over a partition in polars?

I have a Dataset containing GPS Coordinates of a few planes. I would like to calculate the bearing of each plane at every point in time. The Dataset has, among others, these columns: event_uid plane_no ...
jimfawkes's user avatar
  • 385
1 vote
1 answer
87 views

How to conditionally choose which column to backward fill over in polars?

I need to backfill a column over one of three possible columns, based on which one matches the non-null cell in the column to be backfilled. My dataframe looks something like this: import polars as pl ...
epistemetrica's user avatar
4 votes
1 answer
109 views

Grouped Rolling Mean in Polars

Similar question is asked here However it didn't seem to work in my case. I have a dataframe with 3 columns, date, groups, prob. What I want is to create a 3 day rolling mean of the prob column values ...
AColoredReptile's user avatar
-1 votes
1 answer
114 views

How can I apply a filter to a window function in Snowflake?

Suppose I have a table like this TRANSACTION_DATE BOOKED_DATE AMOUNT 2024-02-10 2024-02-09 50 2024-02-10 2024-02-10 50 2024-02-10 2024-02-11 50 2024-02-11 2024-02-10 50 2024-02-11 2024-02-11 50 2024-...
Peter Olson's user avatar
1 vote
2 answers
106 views

SQL Window Functions - Pivot on a Column

I have table data as shown below. cust_id city_type city_name start_date 1 physical Las Vegas 5/17/2024 1 office Seattle 5/17/2024 1 office Dallas 9/20/2024 1 physical Dallas 10/30/2024 1 office ...
ragstand's user avatar
3 votes
2 answers
101 views

How to count people inside the building using entrance/leaving logs in PostgreSQL

I have a table with logs of going inside and outside the building. The table looks like that: user_id datetime direction 1 17/2/2025, 18:25:02.000 in 1 17/2/2025, 20:09:10.000 out 2 17/2/2025, 09:55:...
Daniel G's user avatar
0 votes
3 answers
120 views

Oracle Max Over Partition By Excluding Current Row

I have an issue calculating the max() value over partition by, where I want to exclude the current row each time. Assuming I have a table with ID, Group and Value. Calculating max/min/etc. over ...
Rabers's user avatar
  • 45
1 vote
0 answers
73 views

pivot vs window in spark

I have the following requirement Pivot the dataframe to sum amount column based on document type Join the pivot dataframe back to the original dataframe to get additional columns Filter the joined ...
Dhruv's user avatar
  • 597
8 votes
1 answer
414 views

Replacing window function OVER() with WINDOW clause reference yields different results

While preparing an answer to another question here, I coded up a query that contained multiple window functions having the same OVER(...) clause. Results were as expected. select ... sum(sum(s....
T N's user avatar
  • 10.6k
2 votes
3 answers
265 views

Stratified sampling using SQL given an absolute sample size

I have the following population: a b b c c c c I am looking for a SQL statement to generate a stratified sample of arbitrary size. Let's say for this example, I would like a sample size of 4. I ...
Saqib Ali's user avatar
  • 4,551
1 vote
1 answer
101 views

Update every N rows with an increment of 1

I am an SQL server developer working on a project in a PostgreSQL environment. I am having some PostgreSQL syntax issues. I am working off version 9.3. In a given table, I am trying to set every 10 ...
Peter Sun's user avatar
  • 1,943
0 votes
0 answers
69 views

How to pick change values from a column other than using window functions in Snowflake

I have a Snowflake table with data like below: Table1 Col1 Col2 Col3 G1 1 9:15 G1 1 9:16 G1 2 9:17 G1 1 9:18 G2 1 9:15 G2 2 9:16 I want to ...
Prachi's user avatar
  • 564
0 votes
0 answers
16 views

Is there a good way to add columns calculated using a window partition within pandas chaining

My background is in SQL and I was wondering what was the most efficient/readable way of creating multiple columns using the same window partition within a pandas chain. Suppose I have the following ...
gjk515's user avatar
  • 23
0 votes
2 answers
69 views

Calculate Date Difference for Non-Consecutive Months

I am trying to find gaps in enrollment and have a table set up like this: ID Enrollment _Month Consecutive_Months 1 202403 1 1 202404 2 1 202405 3 1 202409 1 1 202410 2 1 202411 3 2 202401 1 2 202402 ...
Sophia's user avatar
  • 89
0 votes
1 answer
64 views

spark scala ignore nulls in windowing clause

In Spark SQL, you can write SELECT title, rn, lead(rn, 1) IGNORE NULLS over(order by rn) as next_rn FROM my_table ; How would you add the IGNORE NULLS part in the equivalent Scala code? val ...
M.S.Visser's user avatar
0 votes
3 answers
73 views

Most recent status of each item as of the 1st of each month

I have a table that is structured in the following way: fiddle create table test(id,status,datechange)as values ('011AVN', 11, '2024-06-21 08:27:13'::timestamp) ,('011AVN', 12, '2024-06-21 08:28:16') ...
HappyTaco's user avatar
0 votes
3 answers
200 views

Using SQL Server window functions with year and month(Period of time)

Please consider this script: Declare @tbl Table ( F1 int, F2 int, Year int, Month tinyint ) Insert into @tbl values (10, 1, 2020, 1), (10, 1, 2020, 2), (10, 1, 2020, 3), (10, ...
DooDoo's user avatar
  • 13.1k
-2 votes
1 answer
75 views

How to rank rows considering ties?

How to show numbers 1 1 3 4 5 5 7... in PostgreSQL query Example: create table test(name,sum_all)as values ('a',100) ,('b',95) ,('c',100) ,('d',75) ,('e',55); Desired results name sum_all ...
momoman's user avatar
  • 39
1 vote
1 answer
104 views

How to assign unique UUIDs to groups of metrics in a PostgreSQL table with repeated names?

I’m working with a PostgreSQL table that stores metric data for different assets. The table currently has over 1 billion records. Each update will have multiple metrics, e.g., speed, distance, ...
NRaf's user avatar
  • 7,579
0 votes
1 answer
62 views

Copy an ID to rows with adjacent timestamp ranges sharing a class

Here is a sample table. create table test(ID,Start_date_time,End_date_time,class) as values (131, '5/26/2021 11:42', '5/26/2021 12:42', 'AAA') ,(132, '5/26/2021 12:42', '5/26/2021 13:18', 'AAA')...
TCO's user avatar
  • 177
1 vote
1 answer
75 views

Sum a Column of a Timeseries by an Order Number when the Ordernumber is not unique

I have a table like this: demo at db<>fiddle CREATE TABLE test(id, order_id, start, end1, count) AS VALUES (1, 1, '2023-12-19 10:00:00'::timestamp, '2023-12-19 11:00:00'::timestamp, 15), (2, 1, '...
Axel Siebert's user avatar
1 vote
5 answers
135 views

Streak for a given endDate SQL (Postgres)

Input data date number 2024-11-02 1000 2024-11-03 500 2024-11-05 1000 2024-11-06 1000 2024-11-07 1000 2024-11-08 500 2024-11-14 1000 2024-11-15 1000 for a given date I want to get the streak (dates ...
user1117605's user avatar
1 vote
4 answers
117 views

Number of missing periods between dates

I'm using Postgres and I would like to find missing ranges of dates. I've got this table with this data: create table event_dates(date)AS VALUES('2024-12-09'::date) ...
lcc's user avatar
  • 13
2 votes
2 answers
94 views

How can I force a WINDOW function in MySQL to show 'NULL' unless complete window frame is available?

I want to get moving sum and moving average on each date for last 7 days (including current day). I used WINDOW function and used ROWS BETWEEN to frame the function which calculates correctly, but it ...
Syed Talha Tariq's user avatar
2 votes
1 answer
80 views

How to calculate average stock status in day

Stock status for days is in table create table stockstatus ( stockdate date not null, -- date of stock status product character(60) not null, -- product id status int not null, -- stock status in ...
Andrus's user avatar
  • 28.2k
