Using Multiple ANDs and ORs in ANSI SQL

I have a simple SQL query:
SELECT
w.fizz
FROM
widgets w
WHERE
w.special_id = 2394
AND w.buzz IS NOT NULL
AND w.foo = 12
In pseudo-code, this WHERE clause could be thought of as:
if(specialId == 2394 && buzz != null && foo == 12)
I now want to change this query so that it returns all widgets whose special_id is 2394, and whose buzz is not null, and whose foo is 12, OR whose special_id is 2394, and whose blah is 'YES', and whose num is 4. In pseudo-code:
if(specialId == 2394 && (buzz != null && foo == 12) || (blah == "YES" && num == 4))
I tried the following, only to get errors:
SELECT
w.fizz
FROM
widgets w
WHERE
w.special_id = 2394
AND
(
w.buzz IS NOT NULL
AND w.foo = 12
)
OR
(
w.blah = 'YES'
AND w.num = 4
)
Any ideas? Thanks in advance!

SELECT
w.fizz
FROM
widgets w
WHERE
w.special_id = 2394
AND
(
(
w.buzz IS NOT NULL
AND w.foo = 12
)
OR
(
w.blah = 'YES'
AND w.num = 4
)
)
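Wrap the whole OR expression in an extra pair of parentheses. AND has higher precedence than OR, so without them the condition is read as (special_id = 2394 AND buzz IS NOT NULL AND foo = 12) OR (blah = 'YES' AND num = 4), and the special_id check no longer applies to the second branch.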
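To see the precedence effect on real rows, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and the three sample rows are made up for illustration; only the column names come from the question.

```python
import sqlite3

# In-memory table with hypothetical sample rows (not from the question).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE widgets (fizz TEXT, special_id INTEGER, buzz TEXT, "
    "foo INTEGER, blah TEXT, num INTEGER)"
)
conn.executemany(
    "INSERT INTO widgets VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("a", 2394, "x", 12, "NO", 0),   # matches the buzz/foo branch
        ("b", 2394, None, 0, "YES", 4),  # matches the blah/num branch
        ("c", 9999, None, 0, "YES", 4),  # wrong special_id
    ],
)

# Without the outer parentheses, AND binds first, so the OR branch
# is evaluated independently of special_id.
without_parens = """
SELECT fizz FROM widgets w
WHERE w.special_id = 2394
  AND (w.buzz IS NOT NULL AND w.foo = 12)
   OR (w.blah = 'YES' AND w.num = 4)
ORDER BY fizz
"""

# With the outer parentheses, special_id applies to both branches.
with_parens = """
SELECT fizz FROM widgets w
WHERE w.special_id = 2394
  AND ((w.buzz IS NOT NULL AND w.foo = 12)
    OR (w.blah = 'YES' AND w.num = 4))
ORDER BY fizz
"""

loose = [row[0] for row in conn.execute(without_parens)]
strict = [row[0] for row in conn.execute(with_parens)]
print(loose)   # ['a', 'b', 'c']
print(strict)  # ['a', 'b']
```

Row "c" slips through the first query even though its special_id is not 2394, which is exactly the behaviour the extra parentheses prevent.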

Related

Cannot cast result from Sub-Select to numeric

I am selecting a list of IDs as a sub-query in a condition but it says it cannot convert '123,456' to numeric. The problem occurs in the last line. DB is Sybase-SQL-Anywhere.
SELECT
ISNULL(SUM(a.menge), 0) AS menge,
ISNULL(SUM(a.wert), 0) AS wert
FROM admin.p_ws_ix_kontrakte_ernte_auswertung_jensek a
WHERE
(a.KtrErnteJahr = ? OR ? IS NULL)
AND (
(a.KtrDispoKennz >= ? OR ? IS NULL)
AND
(a.KtrDispoKennz <= ? OR ? IS NULL)
)
AND a.artikelstammid IN ((SELECT LIST(artikelstammId) FROM admin.ws_ix_auswertung_cfg_spalten_artikel WHERE columnId = $column))
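Remove the LIST(). It aggregates the IDs into a single comma-separated string ('123,456'), which cannot be converted to a number for comparison with artikelstammid; IN needs one row per ID: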
-- replace this:
AND a.artikelstammid IN ((SELECT LIST(artikelstammId) FROM admin.ws_ix_auswertung_cfg_spalten_artikel WHERE columnId = $column))
-- with this:
AND a.artikelstammid IN (SELECT artikelstammId FROM admin.ws_ix_auswertung_cfg_spalten_artikel WHERE columnId = $column)
Another option would be an EXISTS with a correlated subquery:
-- replace this:
AND a.artikelstammid IN ((SELECT LIST(artikelstammId) FROM admin.ws_ix_auswertung_cfg_spalten_artikel WHERE columnId = $column))
-- with this:
AND EXISTS (SELECT 1 FROM admin.ws_ix_auswertung_cfg_spalten_artikel b WHERE b.columnId = $column AND b.artikelstammId = a.artikelstammid)
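A small sketch of the difference, run against SQLite through Python's sqlite3 module rather than SQL Anywhere. The tables facts and cfg and their rows are hypothetical, and SQLite's group_concat() is only a stand-in for LIST(). Note that SQLite does not raise a conversion error here; it simply finds no match for the aggregated string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the view and the config table in the question.
conn.executescript("""
CREATE TABLE facts (artikelstammid INTEGER, menge REAL);
CREATE TABLE cfg (columnId INTEGER, artikelstammId INTEGER);
INSERT INTO facts VALUES (123, 10), (456, 5), (789, 99);
INSERT INTO cfg VALUES (1, 123), (1, 456), (2, 789);
""")

def total(condition):
    """Sum menge over the rows of facts that satisfy the given condition."""
    sql = "SELECT COALESCE(SUM(a.menge), 0) FROM facts a WHERE " + condition
    return conn.execute(sql, {"column": 1}).fetchone()[0]

# Aggregated subquery: yields the single string '123,456', matching no ID.
aggregated = total(
    "a.artikelstammid IN "
    "(SELECT group_concat(artikelstammId) FROM cfg WHERE columnId = :column)"
)
# Plain subquery: one row per ID, which is what IN expects.
plain = total(
    "a.artikelstammid IN "
    "(SELECT artikelstammId FROM cfg WHERE columnId = :column)"
)
# Correlated EXISTS: same result as the plain IN subquery.
exists = total(
    "EXISTS (SELECT 1 FROM cfg b "
    "WHERE b.columnId = :column AND b.artikelstammId = a.artikelstammid)"
)

print(aggregated, plain, exists)  # 0 15.0 15.0
```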

Power BI DAX - find repeatability

Given data as such:
Month ValueA
1 T
2 T
3 T
4 F
Is there a way to make a measure that finds, for each month, whether the last three values were all True?
So the output would be (F,F,T,F)?
That would probably mean that my actual problem is solvable, which is finding, from:
Month ValueA ValueB ValueC
1 T F T
2 T T T
3 T T T
4 F T F
the count of those booleans for each row, so the output would be (0,0,2[A and C],1[B])
EDIT:
Okay, I managed to solve the first part with this:
Previous =
VAR PreviousDate =
MAXX(
FILTER(
ALL( 'Table' ),
EARLIER( 'Table'[Month] ) > 'Table'[Month]
),
'Table'[Month]
)
VAR PreviousDate2 =
MAXX(
FILTER(
ALL( 'Table' ),
EARLIER( 'Table'[Month] ) - 1 > 'Table'[Month]
),
'Table'[Month]
)
RETURN
IF(
CALCULATE(
MAX( 'Table'[Value] ),
FILTER(
'Table',
'Table'[Month] = PreviousDate
)
) = "T"
&& CALCULATE(
MAX( 'Table'[Value] ),
FILTER(
'Table',
'Table'[Month] = PreviousDate2
)
) = "T"
&& 'Table'[Value] = "T",
TRUE,
FALSE
)
But is there a way to use it with an unknown number of columns,
without hard-coding every column name? Like a loop or something.
I would reshape the data table in Power Query (unpivoting the ValueX columns) and change T/F to 1/0 (the OneOrZero column used below). Then add a dimension table, dimMonth, with a relationship to Month.
Then add a measure like this:
Three Consec T =
var maxMonth = MAX('Data'[Month])
var tempTab =
FILTER(
dimMonth;
'dimMonth'[MonthNumber] <= maxMonth && 'dimMonth'[MonthNumber] > maxMonth -3
)
var sumMonth =
MAXX(
'dimMonth';
CALCULATE(
SUM('Data'[OneOrZero]);
tempTab
)
)
return
IF(
sumMonth >= 3;
"3 months in a row";
"No"
)
Then I can have a visual where the slicer indicates which time window I'm looking at and the table shows whether there have been 3 consecutive Ts or not.
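Outside of DAX, the same idea (treat every ValueX column identically instead of hard-coding names, then check a 3-month window) can be sketched in plain Python. This is only an illustration of the logic using the sample data from the question, not a Power BI solution; the helper name streak_flags is made up.

```python
# Sample data from the question, one list per ValueX column.
months = [1, 2, 3, 4]
columns = {
    "ValueA": ["T", "T", "T", "F"],
    "ValueB": ["F", "T", "T", "T"],
    "ValueC": ["T", "T", "T", "F"],
}
WINDOW = 3  # number of consecutive months that must all be "T"

def streak_flags(values, window=WINDOW):
    """For each position, True if it and the previous window-1 values are all 'T'."""
    flags = []
    for i in range(len(values)):
        last = values[max(0, i - window + 1): i + 1]
        flags.append(len(last) == window and all(v == "T" for v in last))
    return flags

# Loop over whatever columns exist, so no column name is hard-coded.
flags_by_column = {name: streak_flags(vals) for name, vals in columns.items()}

# Per month, count how many columns currently have a 3-month streak.
counts = [
    sum(flags[i] for flags in flags_by_column.values())
    for i in range(len(months))
]

print(flags_by_column["ValueA"])  # [False, False, True, False]
print(counts)                     # [0, 0, 2, 1]
```

The two printed results correspond to the (F,F,T,F) and (0,0,2,1) outputs the question asks for.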

Pandas .loc[] method is too slow, how can I speed it up

I have a dataframe with 40 million rows, and I want to change some columns with:
age = data[data['device_name'] == 12]['age'].apply(lambda x : x if x != -1 else max_age)
data.loc[data['device_name'] == 12,'age'] = age
But this method is too slow. How can I speed it up?
Thanks for any replies!
You might want to change the first part to:
age = data[data['device_name'] == 12]['age']
age[age == -1] = max_age
data.loc[data['device_name'] == 12,'age'] = age
You could also use the following, which is more concise (and might gain you a little speed):
cond = data['device_name'] == 12
age = data.loc[cond, 'age']
data.loc[cond,'age'] = age.where(age != -1, max_age)
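Another variant, as a sketch: fold both conditions into one boolean mask and assign once, which avoids the intermediate age Series and the Python-level lambda. The tiny dataframe and the way max_age is computed here are assumptions for the demo; in the question max_age is defined elsewhere.

```python
import pandas as pd

# Tiny stand-in for the 40-million-row dataframe.
data = pd.DataFrame({
    "device_name": [12, 12, 7, 12],
    "age": [-1, 30, -1, 45],
})
max_age = data["age"].max()  # assumption: the question computes max_age elsewhere

# One vectorised mask: rows for device 12 whose age is the -1 placeholder.
mask = (data["device_name"] == 12) & (data["age"] == -1)
data.loc[mask, "age"] = max_age

print(data["age"].tolist())  # [45, 30, -1, 45]
```

The row for device 7 keeps its -1 because it is outside the mask.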

Using GROUP BY/CASE with WHEN or IF

note: edited query below.
I am looking to segment a data set according to two criteria:
If a customer has more or less than 4 txns at a specific restaurant
If a customer has more or less than 24 txns at all the other restaurants in that data set.
I am using a combination of GROUP BY, CASE and WHEN or IF. I am not sure which approach is best, if either?
SELECT
COUNT(Customer) AS number_of_customers,
AVG (CASE WHEN ItemPrice LIKE '-%' THEN NULL
WHEN ItemPrice LIKE '0%' THEN NULL
ELSE CAST (ItemPrice AS FLOAT) END) AS avg_item_price,
COUNT(DISTINCT(ReceiptIDDesc)) AS number_of_orders,
SUM(CAST(ItemPrice AS FLOAT)) AS total_spend
FROM Tacos
WHERE NOT (PurchaseDate > '01/01/2016 12:00' OR '03/01/2016 12:00'<
PurchaseDate)
GROUP BY
CASE
WHEN (COUNT('MerchantFamily' = %TacoTruck%)> 2) AND COUNT('MerchantFamily' != %TacoTruck%) >24)
THEN 'Fanatic'
WHEN (COUNT('MerchantFamily' = %TacoTruck%)> 2) AND COUNT('MerchantFamily' != %TacoTruck%) <24)
THEN 'Loyalist'
WHEN (COUNT('MerchantFamily' = %TacoTruck%)< 2) AND COUNT('MerchantFamily' != %TacoTruck%) <24)
THEN 'Seldom'
ELSE
'Potential'
END
OR
GROUP BY
CASE
IF(COUNT(IF( 'MerchantFamily' = 'TacoTruck', 1, 0 ) ) > 2, TRUE, FALSE)
AND
IF(COUNT(IF( 'MerchantFamily' != 'TacoTruck',1, 0) ) < 24, TRUE, FALSE), 'Loyalist', NULL )
IF(COUNT(IF( 'MerchantFamily' = 'TacoTruck', 1, 0 ) ) > 2, TRUE, FALSE)
AND
IF(COUNT(IF( 'MerchantFamily' != 'TacoTruck', 1, 0 ) ) > 24, TRUE, FALSE), 'Fanatic', NULL)
IF(COUNT(IF( 'MerchantFamily' = 'TacoTruck', 1, 0 ) ) < 2, TRUE, FALSE)
AND
IF(COUNT( IF( 'MerchantFamily' != 'TacoTruck', 1, 0 ) ) < 24, TRUE, FALSE), 'Seldom', NULL)
ELSE
'Potential'
END
Neither of those approaches will work. You need to group first and then evaluate the aggregated counts, either in a HAVING clause or by wrapping the grouped query in a nested subquery (a "derived table").
A CASE expression only evaluates values on a per-row basis; it does not scan multiple rows.
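A rough sketch of the derived-table approach, run on SQLite via Python's sqlite3 module. The sample rows, the reduced Tacos table, and the thresholds passed as :truck_min and :other_min are made up for the demo; the inner query counts per customer, and only then does the CASE label each customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tacos (Customer TEXT, MerchantFamily TEXT)")
# Hypothetical transactions: one row per purchase.
rows = (
    [("ann", "TacoTruck")] * 3 + [("ann", "Other")] * 1    # many TacoTruck, few elsewhere
    + [("bob", "TacoTruck")] * 1 + [("bob", "Other")] * 1  # few of both
    + [("cy", "TacoTruck")] * 3 + [("cy", "Other")] * 4    # many of both
)
conn.executemany("INSERT INTO Tacos VALUES (?, ?)", rows)

sql = """
SELECT segment, COUNT(*) AS number_of_customers
FROM (
    -- Step 2: label each customer from the already aggregated counts.
    SELECT Customer,
           CASE
               WHEN truck_txns > :truck_min AND other_txns > :other_min THEN 'Fanatic'
               WHEN truck_txns > :truck_min AND other_txns < :other_min THEN 'Loyalist'
               WHEN truck_txns < :truck_min AND other_txns < :other_min THEN 'Seldom'
               ELSE 'Potential'
           END AS segment
    FROM (
        -- Step 1: one row per customer with conditional counts.
        SELECT Customer,
               SUM(CASE WHEN MerchantFamily = 'TacoTruck' THEN 1 ELSE 0 END) AS truck_txns,
               SUM(CASE WHEN MerchantFamily <> 'TacoTruck' THEN 1 ELSE 0 END) AS other_txns
        FROM Tacos
        GROUP BY Customer
    ) AS per_customer
) AS labelled
GROUP BY segment
ORDER BY segment
"""
segments = conn.execute(sql, {"truck_min": 2, "other_min": 3}).fetchall()
print(segments)  # [('Fanatic', 1), ('Loyalist', 1), ('Seldom', 1)]
```

Other per-segment aggregates (average item price, total spend) would be computed in the innermost query and carried outward in the same way.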

Progress DB, need to merge two queries

I have 2 progress database queries and I'm trying to merge them into one statement, but I am getting errors. Each of these queries simply returns a number and I would like to sum those 2 numbers together. Either that or make another query from scratch. They both take in a set of value codes for "DM1" and they both accept 1 "product".
Query 1
SELECT SUM(opn3.samt)
FROM PUB.ord ord3, PUB.opn opn3
WHERE
ord3.subsnum = opn3.subsnum
AND ord3.onum = opn3.onum
AND ord3.DM1 != ''
AND ord3.DM1 IN('XCWAJC25','WCWAMO73')
AND ord3.prdcde = 'CSC'
AND ord3.stat != 16
AND opn3.samt >= 0
GROUP BY ord3.DM1, ord3.prdcde
Query 2
SELECT SUM((-1 * opn2.samt) + ord2.samt)
FROM PUB.ord ord2, PUB.opn opn2
WHERE
ord2.subsnum = opn2.subsnum
AND ord2.onum = opn2.onum
AND ord2.DM1 != ''
AND ord2.DM1 IN('XCWAJC25','WCWAMO73')
AND ord2.prdcde = 'CSC'
AND ord2.stat = 16
AND opn2.samt < 0
GROUP BY ord2.DM1, ord2.prdcde
Merge attempt so far...
SELECT SUM(opn3.samt + (SELECT SUM((-1 * opn2.samt) + ord2.samt)
FROM PUB.ord ord2, PUB.opn opn2
WHERE
ord2.subsnum = opn2.subsnum
AND ord2.onum = opn2.onum
AND ord2.DM1 != ''
AND ord2.DM1 = ord3.DM1
AND ord2.prdcde = ord3.prdcde
AND ord2.stat = 16
AND opn2.samt < 0
GROUP BY ord2.DM1, ord2.prdcde
)) as foo
FROM PUB.ord ord3, PUB.opn opn3
WHERE
ord3.subsnum = opn3.subsnum
AND ord3.onum = opn3.onum
AND ord3.DM1 != ''
AND ord3.DM1 IN('XCWAJC25','WCWAMO73')
AND ord3.prdcde = 'CSC'
AND ord3.stat != 16
AND opn3.samt >= 0
GROUP BY ord3.DM1, ord3.prdcde
Thanks
I think this will work, although it would be nice to have sample data to verify:
SELECT COALESCE(SUM(a.samt), 0) - COALESCE(SUM(b.samt), 0)
+ COALESCE(SUM(CASE WHEN ord.stat = 16
AND b.samt < 0
THEN ord.samt END), 0)
FROM PUB.ord ord
LEFT JOIN PUB.opn a
ON a.subsnum = ord.subsnum
AND a.onum = ord.onum
AND a.samt >= 0
AND ord.stat != 16
LEFT JOIN PUB.opn b
ON b.subsnum = ord.subsnum
AND b.onum = ord.onum
AND b.samt < 0
AND ord.stat = 16
WHERE ord.DM1 IN('XCWAJC25', 'WCWAMO73')
AND ord.prdcde = 'CSC'
GROUP BY ord.DM1
Notes on query/stuff:
Always explicitly qualify joins, don't use the comma-separated FROM clause
I don't think you needed ord.DM1 != '', given that values have to be in a specific set
Putting a clause into a LEFT JOIN condition instead of the WHERE clause has a slightly different effect; it adds the condition to the join, instead of the filtering. This means that rows can be excluded based on something in the left table, regardless of whether you need the actual row (this is why ord.stat ended up in the LEFT JOINs). INNER JOINs would technically behave the same way, but it isn't usually noticeable because causing the right table to be excluded also excludes the left table.
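The ON-versus-WHERE point can be checked with a minimal sketch on SQLite through Python's sqlite3 module. The two-row ord and opn tables here are simplified, hypothetical stand-ins for the PUB tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical versions of the ord and opn tables.
conn.executescript("""
CREATE TABLE ord (onum INTEGER, stat INTEGER);
CREATE TABLE opn (onum INTEGER, samt REAL);
INSERT INTO ord VALUES (1, 16), (2, 5);
INSERT INTO opn VALUES (1, -10), (2, 20);
""")

# Condition in the join: order 1 is kept, with NULL for the unmatched opn row.
in_join = conn.execute("""
SELECT ord.onum, opn.samt
FROM ord
LEFT JOIN opn ON opn.onum = ord.onum AND opn.samt >= 0
ORDER BY ord.onum
""").fetchall()

# Condition in WHERE: applied after the join, so order 1 disappears entirely.
in_where = conn.execute("""
SELECT ord.onum, opn.samt
FROM ord
LEFT JOIN opn ON opn.onum = ord.onum
WHERE opn.samt >= 0
ORDER BY ord.onum
""").fetchall()

print(in_join)   # [(1, None), (2, 20.0)]
print(in_where)  # [(2, 20.0)]
```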
I think this should do the trick, given that the individual queries work as intended:
SELECT sum1.tot + sum2.tot
FROM
(SELECT SUM(opn3.samt) as tot
FROM PUB.ord ord3, PUB.opn opn3
WHERE
ord3.subsnum = opn3.subsnum
AND ord3.onum = opn3.onum
AND ord3.DM1 != ''
AND ord3.DM1 IN('XCWAJC25','WCWAMO73')
AND ord3.prdcde = 'CSC'
AND ord3.stat != 16
AND opn3.samt >= 0
GROUP BY ord3.DM1, ord3.prdcde) sum1,
(SELECT SUM((-1 * opn2.samt) + ord2.samt) as tot
FROM PUB.ord ord2, PUB.opn opn2
WHERE
ord2.subsnum = opn2.subsnum
AND ord2.onum = opn2.onum
AND ord2.DM1 != ''
AND ord2.DM1 IN('XCWAJC25','WCWAMO73')
AND ord2.prdcde = 'CSC'
AND ord2.stat = 16
AND opn2.samt < 0
GROUP BY ord2.DM1, ord2.prdcde) sum2