Alternate numbering of rows when the data changes - sql

Current Data:
|LotNumber |ItemNumber |ItemName
|76 |1 |WidgetRight
|76 |1 |WidgetLeft
|76 |2 |WidgetRight
|76 |2 |WidgetLeft
|76 |7 |WidgetRight
|76 |7 |WidgetLeft
|76 |9 |WidgetRight
|76 |10 |WidgetRight
|76 |10 |WidgetLeft
|80 |3 |WidgetRight
|80 |3 |WidgetLeft
|80 |7 |WidgetRight
|80 |7 |WidgetLeft
|80 |11 |WidgetRight
|80 |31 |WidgetRight
I need to add a column that alternates between 1 and 2 every time the ItemNumber changes:
|LotNumber |ItemNumber |ItemName |NewField
|76 |1 |WidgetRight |1
|76 |1 |WidgetLeft |1
|76 |2 |WidgetRight |2
|76 |2 |WidgetLeft |2
|76 |7 |WidgetRight |1
|76 |7 |WidgetLeft |1
|76 |9 |WidgetRight |2
|76 |10 |WidgetRight |1
|76 |10 |WidgetLeft |1
|80 |3 |WidgetRight |2
|80 |3 |WidgetLeft |2
|80 |7 |WidgetRight |1
|80 |7 |WidgetLeft |1
|80 |11 |WidgetRight |2
|80 |31 |WidgetRight |1
Current code (not working):
DECLARE #LastLine Varchar(1)
SELECT ItemNumber, ItemName,
CASE WHEN ItemNumber <> LAG (ItemNumber, 1, 0) OVER (PARTITION BY LotNumber ORDER BY LotNumber, ItemNumber )
THEN
CASE WHEN #LastLine = '1'
THEN '2'
ELSE '1'
END
END AS LastLine
Thank you for any help.

You can just use dense_rank() and some arithmetic:
select t.*,
1 + (dense_rank() over(order by lotnumber, itemnumber) + 1) % 2 as newfield
from mytable t
Demo on DB Fiddle
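As a rough check of the dense_rank() approach, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database and runs the same expression. It assumes SQLite 3.25 or newer (for window functions); the table name mytable comes from the answer, and the final ORDER BY is only there to make the printed order match the question.

```python
import sqlite3

# Sample rows copied from the question.
rows = [
    (76, 1, "WidgetRight"), (76, 1, "WidgetLeft"),
    (76, 2, "WidgetRight"), (76, 2, "WidgetLeft"),
    (76, 7, "WidgetRight"), (76, 7, "WidgetLeft"),
    (76, 9, "WidgetRight"),
    (76, 10, "WidgetRight"), (76, 10, "WidgetLeft"),
    (80, 3, "WidgetRight"), (80, 3, "WidgetLeft"),
    (80, 7, "WidgetRight"), (80, 7, "WidgetLeft"),
    (80, 11, "WidgetRight"), (80, 31, "WidgetRight"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (LotNumber INTEGER, ItemNumber INTEGER, ItemName TEXT)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)

# dense_rank() gives 1, 2, 3, ... per distinct (LotNumber, ItemNumber);
# (rank + 1) % 2 maps odd ranks to 0 and even ranks to 1, then 1 is added.
query = """
SELECT LotNumber, ItemNumber, ItemName,
       1 + (DENSE_RANK() OVER (ORDER BY LotNumber, ItemNumber) + 1) % 2 AS NewField
FROM mytable
ORDER BY LotNumber, ItemNumber, ItemName DESC
"""
result = conn.execute(query).fetchall()
new_field = [r[3] for r in result]
print(new_field)
```

Because the rank is computed over both columns without a PARTITION BY, the alternation carries on across lot boundaries, which is what the expected output in the question shows.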

Related

Is it possible to find the sum of values in one row in SQLite?

If I have data in a table with integers like the example below, is it possible to calculate for each row the sum of several columns and output that sum as well as several other columns through an sqlite query command?
My table looks like this below
|Timestamp |Email |Name |Year|Make |Model |Car_ID|Judge_ID|Judge_Name|Racer_Turbo|Racer_Supercharged|Racer_Performance|Racer_Horsepower|Car_Overall|Engine_Modifications|Engine_Performance|Engine_Chrome|Engine_Detailing|Engine_Cleanliness|Body_Frame_Undercarriage|Body_Frame_Suspension|Body_Frame_Chrome|Body_Frame_Detailing|Body_Frame_Cleanliness|Mods_Paint|Mods_Body|Mods_Wrap|Mods_Rims|Mods_Interior|Mods_Other|Mods_ICE|Mods_Aftermarket|Mods_WIP|Mods_Overall|
|--------------|---------------------------|----------|----|--------|---------|------|--------|----------|-----------|------------------|-----------------|----------------|-----------|--------------------|------------------|-------------|----------------|------------------|------------------------|---------------------|-----------------|--------------------|----------------------|----------|---------|---------|---------|-------------|----------|--------|----------------|--------|------------|
|8/5/2018 14:10|honoland13#japanpost.jp |Hernando |2015|Acura |TLX |48 |J04 |Bob |0 |0 |2 |2 |4 |4 |0 |2 |4 |4 |2 |4 |2 |2 |2 |2 |2 |0 |4 |4 |4 |6 |2 |0 |4 |
|8/5/2018 15:11|nlighterness2q#umn.edu |Noel |2015|Jeep |Wrangler |124 |J02 |Carl |0 |6 |4 |2 |4 |6 |6 |4 |4 |4 |6 |6 |6 |6 |6 |4 |6 |6 |6 |6 |6 |4 |6 |4 |6 |
|8/5/2018 17:10|eguest47#microsoft.com |Edan |2015|Lexus |Is250 |222 |J05 |Adrian |0 |0 |0 |0 |0 |0 |0 |0 |6 |6 |6 |0 |0 |6 |6 |6 |0 |0 |0 |0 |0 |0 |0 |0 |4 |
|8/5/2018 17:34|hchilley40#fema.gov |Hieronymus|1993|Honda |Civic eG |207 |J06 |Aaron |0 |0 |2 |2 |2 |2 |2 |2 |0 |4 |2 |2 |2 |2 |2 |2 |4 |2 |2 |0 |0 |0 |2 |2 |0 |
|8/5/2018 14:30|nnowick3d#tuttocitta.it |Nickolas |2016|Ford |Mystang |167 |J02 |Carl |0 |0 |2 |2 |0 |2 |2 |0 |0 |0 |0 |2 |0 |2 |2 |2 |0 |0 |2 |0 |0 |0 |0 |0 |2 |
|8/5/2018 16:12|mdearl39#amazon.co.uk |Martin |2013|Hyundai |Gen coupe|159 |J04 |Bob |0 |0 |2 |0 |0 |0 |2 |0 |0 |0 |0 |2 |0 |2 |2 |0 |2 |0 |2 |0 |0 |0 |0 |0 |0 |
How can I find the sum of columns 10 to 34 for each row, then output each row up to column 7 followed by a column with the total for that row? So far I've only figured out how to get the sum of each column individually, but not how to sum across several columns for each row and output only the desired columns.
SELECT Car_ID, Year, Make, Model, SUM(Mods_ICE) FROM Carstable
But this only outputs data for one row at the bottom of the table with the sum. Expected outcome would be something like below
|Car_ID|Year |Make |Model |Total |
|------|------|------|---------|-------|
|48 |2015 |Acura |TLX |89 |
|22 |2015 |Chevy |Camaro |101 |
|19 |2006 |Ford |Mustang |55 |
|101 |2011 |Subaru|WRX |91 |
To sum columns within a single row you don't need an aggregate function like SUM. Use the + operator:
SELECT column10 + ... + column34 FROM Carstable
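A small sketch of the same idea, run against an in-memory SQLite database. The table here is a cut-down, hypothetical version of Carstable with only three of the score columns, and the NULL in the last row is invented to show why wrapping each column in COALESCE can be useful (in SQL, anything plus NULL is NULL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, reduced Carstable: only three of the score columns.
conn.execute("""
CREATE TABLE Carstable (
    Car_ID INTEGER, Year INTEGER, Make TEXT, Model TEXT,
    Racer_Turbo INTEGER, Racer_Supercharged INTEGER, Racer_Performance INTEGER
)""")
conn.executemany(
    "INSERT INTO Carstable VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (48, 2015, "Acura", "TLX", 0, 0, 2),
        (124, 2015, "Jeep", "Wrangler", 0, 6, 4),
        (207, 1993, "Honda", "Civic eG", None, 0, 2),  # NULL is made up for the demo
    ],
)

# One output row per input row: the + operator works within a row,
# whereas SUM() would collapse all rows into one.
totals = conn.execute("""
SELECT Car_ID, Year, Make, Model,
       COALESCE(Racer_Turbo, 0)
     + COALESCE(Racer_Supercharged, 0)
     + COALESCE(Racer_Performance, 0) AS Total
FROM Carstable
ORDER BY Car_ID
""").fetchall()

for row in totals:
    print(row)
```

For the real table the expression would simply list every score column from Racer_Turbo through Mods_Overall in the same way.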

Filtering a pyspark dataframe to get rows from last N days doesn't work correctly and returns some data before the stated date

I added a new column to my Spark dataframe by combining these 4 columns (year, month, day, time_hour) as follows:
df = df.withColumn("tmp_timestamp", concat_ws(" ", concat_ws("_", col("year"), col("month"), col("day")), col("time_hour")))\
.withColumn("tsmp", unix_timestamp("tmp_timestamp", "yyyy_MM_dd HH"))\
.withColumn("date_time_EET", from_utc_timestamp(col("tsmp").cast('timestamp'), "EET"))\
.drop("tmp_timestamp")
And here is the result:
+-------+--------+------+---------+-------------+----------+---------------------+
|year |month |day |time_hour|tmp_timestamp|tsmp |date_time_EET |
+-------+--------+------+---------+-------------+----------+---------------------+
|2021 |1 |22 |0 |2021_1_22 0 |1611273600|2021-01-22 02:00:00.0|
|2021 |1 |22 |9 |2021_1_22 9 |1611306000|2021-01-22 11:00:00.0|
|2021 |1 |22 |4 |2021_1_22 4 |1611288000|2021-01-22 06:00:00.0|
|2021 |1 |22 |8 |2021_1_22 8 |1611302400|2021-01-22 10:00:00.0|
|2021 |1 |22 |23 |2021_1_22 23 |1611356400|2021-01-23 01:00:00.0|
|2021 |1 |22 |7 |2021_1_22 7 |1611298800|2021-01-22 09:00:00.0|
|2021 |1 |22 |10 |2021_1_22 10 |1611309600|2021-01-22 12:00:00.0|
|2021 |1 |22 |4 |2021_1_22 4 |1611288000|2021-01-22 06:00:00.0|
|2021 |1 |22 |8 |2021_1_22 8 |1611302400|2021-01-22 10:00:00.0|
|2021 |1 |22 |23 |2021_1_22 23 |1611356400|2021-01-23 01:00:00.0|
+-------+--------+------+---------+-------------+----------+---------------------+
only showing top 10 rows
Now I need to filter this new dataframe and get the rows whose date_time_EET value is later than a specific date, e.g. to get the data for the last 10 days. This is how I'm doing it:
N = 10
date_N_days_ago = start_day - timedelta(days=N) # In this example date_N_days_ago will be 2021-01-24
df.filter(col('date_time_EET')> date_N_days_ago)\
.select("year","month","day","time_hour","date_time_EET", )\
.dropDuplicates(["year","month","day","time_hour"])\
.orderBy("year","month","day","time_hour",)\
.show(100, False)
which gives me the following result:
+-------+--------+------+---------+---------------------+
|year |month |day |time_hour|date_time_EET |
+-------+--------+------+---------+---------------------+
|2021 |1 |23 |21 |2021-01-23 23:00:00.0|
|2021 |1 |23 |22 |2021-01-24 00:00:00.0|
|2021 |1 |23 |23 |2021-01-24 01:00:00.0|
|2021 |1 |24 |0 |2021-01-24 02:00:00.0|
|2021 |1 |24 |1 |2021-01-24 03:00:00.0|
|2021 |1 |24 |2 |2021-01-24 04:00:00.0|
|2021 |1 |24 |3 |2021-01-24 05:00:00.0|
|2021 |1 |24 |4 |2021-01-24 06:00:00.0|
|2021 |1 |24 |5 |2021-01-24 07:00:00.0|
|2021 |1 |24 |6 |2021-01-24 08:00:00.0|
|2021 |1 |24 |7 |2021-01-24 09:00:00.0|
|2021 |1 |24 |8 |2021-01-24 10:00:00.0|
...
Why do I get the first row, where date_time_EET is "2021-01-23 23:00:00.0", while date_N_days_ago is 2021-01-24? Also, when I add another condition to the filter to return dates before the start_date, it doesn't return the data for the last hour (2021-02-02 23:00:00.0 is missing):
df.filter((col('date_time_EET')> date_N_days_ago) & (col('date_time_EET') <= start_date))\
.select("year","month","day","time_hour","date_time_EET", )\
.dropDuplicates(["year","month","day","time_hour"])\
.orderBy("year","month","day","time_hour",)\
.show(100, False)
+-------+--------+------+---------+---------------------+
|year |month |day |time_hour|date_time_EET |
+-------+--------+------+---------+---------------------+
....
|2021 |2 |2 |16 |2021-02-02 18:00:00.0|
|2021 |2 |2 |17 |2021-02-02 19:00:00.0|
|2021 |2 |2 |18 |2021-02-02 20:00:00.0|
|2021 |2 |2 |19 |2021-02-02 21:00:00.0|
|2021 |2 |2 |20 |2021-02-02 22:00:00.0|
+-------+--------+------+---------+---------------------+
I think you need to convert the time to the EET timezone before doing the comparison:
import pytz
date_N_days_ago = (start_day - timedelta(days=N)).astimezone(pytz.timezone('EET'))
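To see what that conversion changes, here is a minimal standard-library sketch. It makes two assumptions that are not in the question: start_day is taken to be 2021-02-03 00:00 UTC (the question never shows its definition), and EET is modelled as a fixed UTC+2 offset, which holds in winter but ignores daylight saving time. With pytz or zoneinfo you would use the real "EET" zone instead.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EET as a fixed UTC+2 offset (true in January, ignores DST).
EET = timezone(timedelta(hours=2), "EET")

# Hypothetical start_day; chosen so that 10 days earlier is 2021-01-24.
start_day = datetime(2021, 2, 3, tzinfo=timezone.utc)
N = 10

cutoff_utc = start_day - timedelta(days=N)
cutoff_eet = cutoff_utc.astimezone(EET)

print(cutoff_utc.isoformat())  # 2021-01-24T00:00:00+00:00
print(cutoff_eet.isoformat())  # 2021-01-24T02:00:00+02:00

# Both objects describe the same instant; only the wall-clock reading differs.
print(cutoff_utc == cutoff_eet)  # True
```

The two-hour gap between the wall-clock readings is the kind of shift that could explain rows appearing or disappearing near the edges of the window when a UTC cutoff is compared against EET wall-clock values. Note also that calling astimezone() on a naive datetime treats it as system local time, so the result depends on the machine's timezone.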

Write SQL Server query

Please help me with this query:
select *
from sc84 as nom
join sc319 as p on p.PARENTEXT = nom.id
join sc219 as pt on p.sp327 = pt.id
join _1SCONST as c on c.objid=p.id
The result is approximately this table:
Car / price_base / 08-08-2016:13-40 / 100 /
Car / price_base / 08-08-2016:14-40 / 150 /
Car / price_base / 08-09-2016:13-40 / 190 /
Car / price_super / 08-09-2016:18-40 / 210 /
Car / price_super / 08-10-2016:13-40 / 290 /
I want to return
Car / price_base / 08-09-2016:13-40 / 190 /
Car / price_super / 08-10-2016:13-40 / 290 /
That is, the goods, the price types, the dates, and their values. Please advise how to get the latest row (the current price for each price type and each item). I tried options with GROUP BY, but I evidently lack the skill.
nom.id - PK SKU
pt.id - PK price type
p.id -PK price
p.parentext - parent price (sku)
p.sp327 - FK to price type
date = date column
I am using SQL Server 2008.
Table structure
T=1SCONST |
----Columns-------
Name |Descr |Type|Length|Precision
F=ROW_ID |Row ID |I |0 |0
F=OBJID |ID obj(0-cons ) |C |9 |0
F=ID |ID parameter |I |0 |0
F=DATE |Fix date |D |0 |0
F=VALUE |value |V |255 |0
F=DOCID |ID Document |C |9 |0
F=TIME |Time |I |0 |0
F=ACTNO |Action No |I |0 |0
F=LINENO_ |LineNo |S |0 |0
F=TVALUE | |C |3 |0
T=SC319 |Catalog Prices |SC319 |R
----Columns-------
Name |Descr |Type|Length|Precision
F=ROW_ID |Row ID |I |0 |0
F=ID |ID object |C |9 |0
F=PARENTEXT |Parent in other tabl|C |9 |0
F=ISMARK |Object is Marked for|L |0 |0
F=VERSTAMP |Version stamp |I |0 |0
F=SP6681 |(P)Currency |C |9 |0
F=SP6682 |(P)Percent |N |5 |2
F=SP327 |(P)PriceType |C |9 |0
#==TABLE no 24 : Catalog Nomenclature
# Name |Descr |SQLTableNam|RecordLock
T=SC84 |Catalog Nomenclature |SC84 |R
# ----Columns-------
# Name |Descr |Type|Length|Precision
F=ROW_ID |Row ID |I |0 |0
F=ID |ID object |C |9 |0
F=PARENTID |ID parent obj |C |9 |0
F=CODE |object code |C |8 |0
F=DESCR |object description |C |70 |0
F=ISFOLDER |Is Line - Folder |Y |0 |0
F=ISMARK |Object is Marked for|L |0 |0
F=VERSTAMP |Version stamp |I |0 |0
F=SP85 |(P)ArticleNumber |C |25 |0
F=SP86 |(P)BaseUnit |C |9 |0
F=SP208 |(P)SoldByWeight |N |1 |0
F=SP2417 |(P)ItemKind |C |9 |0
F=SP97 |(P)MinStock |N |13 |3
F=SP5066 |(P)ExcludeFromPriceList |N |1 |0
F=SP5013 |(P)CustomsDeclarationNumber |C |9 |0
F=SP94 |(P)MainUnit |C |9 |0
F=SP4427 |(P)MainProperty |C |9 |0
F=SP103 |(P)VATRate |C |9 |0
F=SP104 |(P)SalesTaxRate |C |9 |0
F=SP5012 |(P)CountryOfOrigin |C |9 |0
F=SP8574 |(P)LowerDeviationCost |N |4 |2
F=SP8575 |(P)UpperDeviationCost |N |4 |2
F=SP8576 |(P)LowerDeviationSpec |N |4 |2
F=SP8577 |(P)UpperDeviationSpec |N |4 |2
F=SP8578 |(P)LowerDeviationPurchase |N |4 |2
F=SP8579 |(P)UpperDeviationPurchase |N |4 |2
F=SP8580 |(P)LowerDeviationRetail |N |4 |2
F=SP8581 |(P)UpperDeviationRetail |N |4 |2
F=SP8599 |(P)Length |N |6 |2
F=SP8600 |(P)Width |N |6 |2
F=SP8601 |(P)Height |N |6 |2
F=SP8602 |(P)Volume |N |14 |5
F=SP8606 |(P)MaxPercent |N |4 |2
F=SP8607 |(P)Address |C |9 |0
F=SP95 |(P)Comment |M |0 |0
F=SP101 |(P)FullName |M |0 |0
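-- Number each item's prices per price type, newest first, and keep only row 1.
-- Columns are listed explicitly: a derived table cannot expose duplicate column names (ID, ROW_ID, ...).
-- c.time is added as a tie-breaker for prices fixed on the same date.
select *
from (
select nom.id as item_id, nom.descr as item, pt.id as price_type_id, c.date, c.time, c.value,
ROW_NUMBER() over(partition by nom.id, pt.id order by c.date desc, c.time desc) as NUM
from sc84 as nom
join sc319 as p
on p.PARENTEXT = nom.id
join sc219 as pt
on p.sp327 = pt.id
join _1SCONST as c
on c.objid = p.id
) A
where NUM = 1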
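The ROW_NUMBER() pattern used above can be tried out on the question's sample rows. The sketch below uses an in-memory SQLite database (3.25 or newer) rather than SQL Server, and a hypothetical flattened table named prices that stands in for the result of the four-way join. The timestamps are rewritten as sortable ISO strings, reading the original values as month-day-year, which is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical flattened stand-in for the joined sc84/sc319/sc219/_1SCONST rows.
conn.execute(
    "CREATE TABLE prices (item TEXT, price_type TEXT, priced_at TEXT, value INTEGER)"
)
conn.executemany("INSERT INTO prices VALUES (?, ?, ?, ?)", [
    ("Car", "price_base", "2016-08-08 13:40", 100),
    ("Car", "price_base", "2016-08-08 14:40", 150),
    ("Car", "price_base", "2016-08-09 13:40", 190),
    ("Car", "price_super", "2016-08-09 18:40", 210),
    ("Car", "price_super", "2016-08-10 13:40", 290),
])

# Number the rows of each (item, price_type) group from newest to oldest,
# then keep only the newest row of every group.
latest = conn.execute("""
SELECT item, price_type, priced_at, value
FROM (
    SELECT p.*,
           ROW_NUMBER() OVER (PARTITION BY item, price_type
                              ORDER BY priced_at DESC) AS num
    FROM prices AS p
) AS ranked
WHERE num = 1
ORDER BY item, price_type
""").fetchall()

for row in latest:
    print(row)
```

This yields one row per price type, carrying the value from the most recent timestamp, which matches the two rows the question asks for.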

SQL: Need to SUM column for each type

How can I find the SUM of all scores for the minimum date of each lesson_id please:
-----------------------------------------------------------
|id |uid |group_id |lesson_id |game_id |score |date |
-----------------------------------------------------------
|1 |145 |1 |1 |0 |40 |1391627323 |
|2 |145 |1 |1 |0 |80 |1391627567 |
|3 |145 |1 |2 |0 |40 |1391627323 |
|4 |145 |1 |3 |0 |30 |1391627323 |
|5 |145 |1 |3 |0 |90 |1391627567 |
|6 |145 |1 |4 |0 |20 |1391628000 |
|7 |145 |1 |5 |0 |35 |1391628000 |
-----------------------------------------------------------
I need output:
-------------------
|sum_first_scores |
-------------------
|165 |
-------------------
I have this so far, which lists the score for each minimum date, per lesson, but I need to sum those results as above:
SELECT lesson_id, MIN(date), score AS first_score FROM cdu_user_progress
WHERE cdu_user_progress.uid = 145
GROUP BY lesson_id
You can identify the first score as the one where no earlier record exists. Then just take the sum:
select sum(score)
from cdu_user_progress eup
where eup.uid = 145 and
not exists (select 1
from cdu_user_progress eup2
where eup2.uid = eup.uid and
eup2.lesson_id = eup.lesson_id and
eup2.date < eup.date
);
This assumes that the minimum date for the lesson id has only one score.
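A quick way to check the NOT EXISTS approach against the question's data is to load the seven sample rows into an in-memory SQLite database. The sketch below does that; the table and column names are taken from the question, and SQLite is used only for convenience (the question itself targets MySQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE cdu_user_progress (
    id INTEGER, uid INTEGER, group_id INTEGER, lesson_id INTEGER,
    game_id INTEGER, score INTEGER, date INTEGER
)""")

# Sample rows copied from the question.
conn.executemany("INSERT INTO cdu_user_progress VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, 145, 1, 1, 0, 40, 1391627323),
    (2, 145, 1, 1, 0, 80, 1391627567),
    (3, 145, 1, 2, 0, 40, 1391627323),
    (4, 145, 1, 3, 0, 30, 1391627323),
    (5, 145, 1, 3, 0, 90, 1391627567),
    (6, 145, 1, 4, 0, 20, 1391628000),
    (7, 145, 1, 5, 0, 35, 1391628000),
])

# A row is a "first score" when no earlier row exists
# for the same user and the same lesson.
sum_first_scores = conn.execute("""
SELECT SUM(score)
FROM cdu_user_progress AS eup
WHERE eup.uid = 145
  AND NOT EXISTS (
        SELECT 1
        FROM cdu_user_progress AS eup2
        WHERE eup2.uid = eup.uid
          AND eup2.lesson_id = eup.lesson_id
          AND eup2.date < eup.date
      )
""").fetchone()[0]

print(sum_first_scores)  # 40 + 40 + 30 + 20 + 35 = 165
```

If two rows of the same lesson shared the minimum date, both would pass the NOT EXISTS test and both scores would be counted, which is the caveat stated above.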

SQL - get rows where column is greater than certain amount

I need to get the sum of the scores for the first of each lesson_id, but I also need the overall min and max scores for all lesson_ids as well as some other info:
cdu_groups:
----------------
|id |name |
----------------
|1 |group_1 |
|2 |group_2 |
----------------
cdu_user_progress145:
-----------------------------------------------------------
|id |uid |group_id |lesson_id |game_id |score |date |
-----------------------------------------------------------
|1 |145 |1 |1 |0 |40 |1391627323 |
|2 |145 |1 |1 |0 |80 |1391627567 |
|3 |145 |1 |2 |0 |40 |1391627323 |
|4 |145 |1 |3 |0 |30 |1391627323 |
|5 |145 |1 |3 |0 |90 |1391627567 |
|6 |145 |1 |4 |0 |20 |1391627323 |
|7 |145 |1 |5 |0 |35 |1391627323 |
-----------------------------------------------------------
I need this output:
-----------------------------------------------------------------
|name |group_id |min_score |max_score |... |sum_first_scores |
-----------------------------------------------------------------
|group_1 |1 |20 |90 |... |165 |
-----------------------------------------------------------------
SELECT
cdu_groups.*,
MAX(score) AS max_score,
MIN(score) AS min_score,
COUNT(DISTINCT(lesson_id)) AS scored_lesson_count,
COUNT(DISTINCT CASE WHEN score >= 75 then lesson_Id ELSE NULL END) as passed_lesson_count,
SUM(first_scores.first_score) AS sum_first_scores
FROM cdu_user_progress
JOIN cdu_groups ON cdu_groups.id = cdu_user_progress.group_id
JOIN
(
SELECT lesson_id, MIN(date), score AS first_score FROM cdu_user_progress
WHERE cdu_user_progress.uid = 145
GROUP BY lesson_id
) AS first_scores ON first_scores.lesson_id = cdu_user_progress.lesson_id
WHERE cdu_user_progress.uid = 145
I'm getting this error though:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SUM(first_scores.first_score) AS sum_first_scores FROM cdu_user_progress ' at line 7