How to fix this column doesn't exist error in SQL? - sql

In the sales table, three columns are btl_price, bottle_qty, and total. The total for a transaction should be the product of btl_price and bottle_qty. How many transactions have a value of total that is not equal to btl_price times bottle_qty?
Here is the table:
Here is my code:
sql = """
Select (btl_price*bottle_qty) As total_sale, CAST(total AS money)
From sales
Where total != total_sale
"""
It keeps telling me "column "total_sale" does not exist".
Please help me identify my mistake.
PS: I am running this in a Jupyter Notebook as practice, not directly in a DBMS.

You cannot use columns computed in the SELECT clause in the WHERE clause (in SQL, the latter is evaluated before the former).
Also, you need proper type casting to compare money and numbers.
Finally, you need to use aggregation to compute the number of sales that satisfy the condition.
Assuming that you are using Postgres, that would be:
select count(*)
from sales
where total::numeric <> btl_price::numeric * bottle_qty
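A minimal sketch of the same count, run from Python the way the question does from a notebook. It assumes an in-memory SQLite database as a stand-in for Postgres and uses made-up rows; SQLite has no money type and its alias and type rules differ from Postgres, so this only illustrates the corrected query, not the original error.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption: the real data lives in Postgres).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (btl_price REAL, bottle_qty INTEGER, total REAL)")

# Made-up sample rows; the values are chosen so the products are exact in floating point.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (10.5, 2, 21.0),  # consistent: 10.5 * 2 = 21.0
        (4.25, 4, 17.0),  # consistent: 4.25 * 4 = 17.0
        (3.0, 3, 10.0),   # inconsistent: 3.0 * 3 = 9.0
    ],
)

# Repeat the expression in WHERE instead of referring to a SELECT alias.
mismatch_count = conn.execute(
    "SELECT COUNT(*) FROM sales WHERE total <> btl_price * bottle_qty"
).fetchone()[0]
print(mismatch_count)
```

With real prices stored as floating-point values, an exact comparison can report spurious mismatches, which is one reason the Postgres version casts to numeric.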

Try this:
SELECT *
FROM sales
WHERE total !=(btl_price * bottle_qty)
Good luck

Related

Sum of two columns as criteria

I am trying to match product entries by their surface.
My thought was that the below query should be valid.
But it doesn't work, I am receiving:
Unknown column 'surface' in 'where clause'
SELECT SUM(width*height) AS surface FROM products WHERE surface>50
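The SUM function is not meant for this use case: SUM adds up the values across all rows. What you are looking for is the per-row product, with the expression repeated in the WHERE clause because the alias is not visible there:
SELECT (width*height) AS surface FROM products WHERE (width*height)>50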
This works:
SELECT * FROM products WHERE (width*height)>50
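If repeating the expression feels clumsy, one common alternative is to compute the alias in a subquery and filter on it in the outer query. The sketch below assumes an in-memory SQLite database and a hypothetical `name` column with made-up rows; the question targets MySQL, where the same subquery pattern should also apply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only width and height come from the question.
conn.execute("CREATE TABLE products (name TEXT, width INTEGER, height INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("small", 5, 5), ("medium", 10, 6), ("large", 20, 10)],
)

# The inner query defines the alias; the outer WHERE can then filter on it.
rows = conn.execute(
    """
    SELECT name, surface
    FROM (SELECT name, width * height AS surface FROM products) AS p
    WHERE surface > 50
    ORDER BY surface
    """
).fetchall()
print(rows)
```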

how to calculate prevalence using sql code

I am trying to calculate prevalence in SQL.
I am kind of stuck writing the code.
I want to automate the calculation.
Using COUNT, I have checked that the sample size is 1453477 and that the number of people who have the disease is 851451.
The formula for prevalence is the number of people who have the disease divided by the sample size.
select (COUNT(condition_id)/COUNT(person_id)) as prevalence
from disease
where condition_id=12345;
When I run the above code, I get 1 as the output, where I am supposed to get 0.5858.
Can someone please help me out?
Thanks!
In your current query you count the number of rows in the disease table, once using the column condition_id, once using the column person_id. But the number of rows is the same - this is why you get 1 as a result.
I think you need to find the number of different values for these columns. This can be done using count distinct:
select (COUNT(DISTINCT condition_id)/COUNT(DISTINCT person_id)) as prevalence
from disease
where condition_id=12345;
You can cast with either
count(...)/count(...)::numeric(6,4) or
count(...)/count(...)::decimal
as two options.
The important point is to apply the cast to the numerator or the denominator (in this case, the denominator). Do not apply it to the result of the division, as in
(count(...)/count(...))::numeric(6,4), because by then the integer division has already taken place.
I am pretty sure that the logic that you want is something like this:
select avg( (condition_id = 12345)::int )
from disease;
Your version doesn't have the sample size, because you are filtering out people without the condition.
If you have duplicate people in the data, then this is a little more complicated. One method is:
select (count(distinct person_id) filter (where condition_id = 12345)::numeric /
        count(distinct person_id)
       )
from disease;
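To make the integer-division pitfall concrete, here is a small sketch on made-up data. It assumes an in-memory SQLite database rather than Postgres, so it uses CAST(... AS REAL) and a CASE expression in place of the `::numeric` cast and the FILTER clause; the idea (convert one operand before dividing, and count each person once) is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE disease (person_id INTEGER, condition_id INTEGER)")
# Made-up rows: persons 1 and 2 have condition 12345 (person 1 is recorded twice),
# persons 3 and 4 only have some other condition.
conn.executemany(
    "INSERT INTO disease VALUES (?, ?)",
    [(1, 12345), (1, 12345), (2, 12345), (3, 999), (4, 999)],
)

# Distinct people with the condition; CASE yields NULL otherwise, and COUNT ignores NULLs.
with_condition = "COUNT(DISTINCT CASE WHEN condition_id = 12345 THEN person_id END)"

# Integer divided by integer: the fractional part is lost.
int_div = conn.execute(
    f"SELECT {with_condition} / COUNT(DISTINCT person_id) FROM disease"
).fetchone()[0]

# Casting one operand first gives the actual ratio.
prevalence = conn.execute(
    f"SELECT CAST({with_condition} AS REAL) / COUNT(DISTINCT person_id) FROM disease"
).fetchone()[0]

print(int_div, prevalence)
```

Note that there is no WHERE clause here: filtering on the condition would remove the people without it and shrink the denominator.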

SQL Aggregate Function over partitions

I'm relatively new to SQL but have learned some cool stuff. I'm getting results that don't make sense. I've got a query with several subqueries and what-not but I have a windowed function that isn't working like I'm expecting.
The part that isn't working is this (simplified from the 300 line query):
SELECT AVG(table.sales_amount)
OVER (PARTITION BY table.month, table.sales_rep, table.department)
FROM table
The problem is that when I pull the data non aggregated I get a value different (107) than the above returns (95).
I've used windowed functions for COUNT and SUM and they work fine, but AVG is acting strangely. Am I missing something about how this works with AVG?
The subquery that table is a standin for looks like:
sales_rep, month, department, sales_amount
1, 2017-1, abc, 125.20
1, 2017-2, abc, 120.00
2, 2017-1, def, 100.00
...etc
Working out of Sql Server Management studio
SOLVED: I did finally figure it out. The results I was joining this subquery to had the sales rep multiple times in a month (selling objects A and B), which caused whoever sold both to be counted twice. Whoops, my bad.
The results that you get should be the same values as in:
SELECT AVG(table.sales_amount)
FROM table
GROUP BY table.month, table.sales_rep, table.department;
Of course, the rows will be different. You need to match up the three key columns.
Based on your sample data, it looks like the partitioning keys uniquely define each row. Perhaps you really intend:
SELECT AVG(table.sales_amount) OVER () as overall_average
FROM table;
EDIT:
For the departmental average:
SELECT AVG(table.sales_amount) OVER (partition by table.department) as department_average
FROM table;
After some brute-forcing of potential errors I finally figured out the issue. I was joining that subquery to another one which had multiple instances of a sales_rep in a given month (selling objects a & b), which caused the average of those with sales of both objects to be counted twice instead of once.
So sales rep 1 sold objects a & b, which made his average count as 66% of the department average instead of 50%, while sales rep 2 counted for only 33%.
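The effect described above can be reproduced on a toy data set. This sketch assumes an in-memory SQLite database; the `sold_objects` table and all values are hypothetical, chosen so that rep 1 appears twice after the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (sales_rep INTEGER, department TEXT, sales_amount REAL);
    CREATE TABLE sold_objects (sales_rep INTEGER, object TEXT);  -- hypothetical table
    INSERT INTO sales VALUES (1, 'abc', 120.0), (2, 'abc', 90.0);
    INSERT INTO sold_objects VALUES (1, 'a'), (1, 'b'), (2, 'a');
    """
)

# Average over the original rows: each rep counts once.
avg_before = conn.execute("SELECT AVG(sales_amount) FROM sales").fetchone()[0]

# After the join, rep 1 has two rows (objects a and b), so his amount is weighted twice.
avg_after = conn.execute(
    """
    SELECT AVG(s.sales_amount)
    FROM sales AS s
    JOIN sold_objects AS o ON o.sales_rep = s.sales_rep
    """
).fetchone()[0]

print(avg_before, avg_after)
```

COUNT and SUM are distorted by the same duplication; it is just easier to overlook there. Computing the average before joining, or joining to a de-duplicated subquery, avoids the double counting.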

SQL: How to use sum in group by

SELECT idteam,
job,
price,
COUNT('X') as INFORMS,
SUM(COUNT('X') * price) as TOTAL
FROM REP
JOIN COSTS ON (job = categ AND to_number(to_char(REP,'YYYY')) = year)
GROUP BY idteam, job, price, TOTAL
ORDER BY IDTEAM;
I don't know why, but when I write TOTAL in the GROUP BY, SQL gives me an error: invalid identifier.
I don't know how to resolve that.
Thanks.
The column "TOTAL" is an alias for SUM(COUNT('X') * price).
It cannot be used as a column identifier in the GROUP BY clause, because "TOTAL" is not a known column at the time of grouping. Repeating the expression there would not help either, since aggregate functions are not allowed in GROUP BY; simply leave TOTAL out of the GROUP BY list.
After grouping, some databases let you refer to "TOTAL" in a HAVING clause.
In any case, the version/type of SQL you are using doesn't allow it.
Additionally, why are you COUNTing 'X'? That X is a fixed value, and does not depend on any of your columns. If you would like to count each row, just use Count(1) or Count(*). Also, you don't need to SUM a COUNT. A COUNT is already summed.
You should post the structure of both REP and COSTS. Your linked image doesn't have enough info to support the query you wrote.
select
idteam,
-- job, /* not selected since it would need to be grouped*/
sum(price) as theSUM
from REP
join COSTS
on REP.categ = COSTS.job
and COSTS.year = 2016
group by idteam
order by idteam
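As a rough illustration of the points above (no alias in GROUP BY, COUNT(*) instead of COUNT('X'), no SUM around a COUNT), here is a sketch against an in-memory SQLite database. The table layouts and rows are guesses, since the real structure of REP and COSTS was not posted, and the year condition from the question is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical layouts: the real REP and COSTS structures are unknown.
    CREATE TABLE rep (idteam INTEGER, job TEXT);
    CREATE TABLE costs (categ TEXT, price INTEGER);
    INSERT INTO rep VALUES (1, 'audit'), (1, 'audit'), (1, 'survey'), (2, 'audit');
    INSERT INTO costs VALUES ('audit', 100), ('survey', 40);
    """
)

# Group only by the plain columns; the aggregates are computed per group.
rows = conn.execute(
    """
    SELECT idteam, job, price,
           COUNT(*) AS informs,
           COUNT(*) * price AS total
    FROM rep
    JOIN costs ON job = categ
    GROUP BY idteam, job, price
    ORDER BY idteam, job
    """
).fetchall()
print(rows)
```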

Query cannot be parsed: SELECT avg_galoon, avg_galoon

I don't know if this is the right way to multiply. I want to multiply avg_galoon, the amount sold weekly, by its price, which is 20.
SELECT avg_galoon,
avg_galoon * 20 + NVL(total income , 0)
FROM customer;
total income isn't a column name from the customer table.
If you disagree and can see that column name (via the methods already mentioned in the comments), then you will need to enclose it in the appropriate quoting syntax for the database engine you are using, e.g. [total income] for Microsoft SQL Server, `total income` for MySQL, etc.
In future avoid the use of spaces in column names.
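For illustration, here is how a column name containing a space can be quoted. The sketch assumes an in-memory SQLite database with made-up rows and assumes the column really is called "total income"; it uses standard double-quoted identifiers and COALESCE in place of Oracle's NVL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: the customer table has a column literally named "total income".
conn.execute('CREATE TABLE customer (avg_galoon REAL, "total income" REAL)')
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(3.0, 15.0), (2.5, None)],  # made-up rows; the second has no income recorded
)

# Double quotes mark an identifier; COALESCE replaces NULL with 0 before adding.
rows = conn.execute(
    """
    SELECT avg_galoon,
           avg_galoon * 20 + COALESCE("total income", 0) AS weekly_total
    FROM customer
    ORDER BY avg_galoon
    """
).fetchall()
print(rows)
```

Renaming the column to something like total_income removes the need for quoting altogether.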