SQL query for percentage calculation - single column, all data, using like/wildcard - sql

I'm looking for a SQL query that will give me a simple percentage value based on the number of occurrences of a value in a table with a single data column.
Example:
Table has single column of data, which has a header and 10 data rows:
COLUMN_HEADER
XYZ://abc123xyz456-0
XYZ://abc123xyz456-1
XYZ://abc123xyz456-2
XYZ://abc123xyz456-3
ABC://abc123xyz456-4
XYZ://abc123xyz456-5
XYZ://abc123xyz456-6
ABC://abc123xyz456-7
XYZ://abc123xyz456-8
XYZ://abc123xyz456-9
I'm looking for the query to look for all data that does not start with XYZ://*
and give that as a % of the row count.
In the above example, there are two rows that start with ABC:// and eight that start XYZ:// therefore the result should be:
80.00%
(so 8 out of 10 rows do not start with XYZ://)
As you can tell by now I'm a noob in SQL.
MS SQL 2014
Thanks in advance.

You can do this with conditional aggregation:
select avg(case when COLUMN_HEADER like 'XYZ://%' then 1.0 else 0 end) as xyz_ratio
from your_table;  -- your_table is a placeholder; the question does not name the table
Your logic and examples are backwards. 80% of the rows have values that do start with "XYZ://". Use like or not like as appropriate.
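Here is a minimal, self-contained sketch of that conditional aggregation, run against the ten sample rows from the question. It uses Python's built-in sqlite3 module instead of SQL Server only so that it can be run anywhere; the table name t and column name col are placeholders, since the question names neither.

```python
import sqlite3

# Ten sample values from the question; table "t" and column "col" are hypothetical names.
values = [
    "XYZ://abc123xyz456-0", "XYZ://abc123xyz456-1", "XYZ://abc123xyz456-2",
    "XYZ://abc123xyz456-3", "ABC://abc123xyz456-4", "XYZ://abc123xyz456-5",
    "XYZ://abc123xyz456-6", "ABC://abc123xyz456-7", "XYZ://abc123xyz456-8",
    "XYZ://abc123xyz456-9",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.executemany("INSERT INTO t (col) VALUES (?)", [(v,) for v in values])

# Averaging a 1.0/0 flag gives the fraction of rows that match the pattern.
xyz_ratio, not_xyz_ratio = conn.execute(
    """
    SELECT AVG(CASE WHEN col LIKE 'XYZ://%' THEN 1.0 ELSE 0 END),
           AVG(CASE WHEN col NOT LIKE 'XYZ://%' THEN 1.0 ELSE 0 END)
    FROM t
    """
).fetchone()

# Format the ratios as percentages with two decimals.
xyz_pct = f"{xyz_ratio * 100:.2f}%"
not_xyz_pct = f"{not_xyz_ratio * 100:.2f}%"
print("starts with XYZ://:", xyz_pct)
print("does not start with XYZ://:", not_xyz_pct)
```

On the sample data the like version covers 8 of 10 rows and the not like version covers 2 of 10, which is why the two variants should not be confused.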

Related

Getting another column from same row which has first non-null value in column

I have a SQL table like this and I want to find the average adjusted amt for products partitioned by store_id that looks like this
Here, I need to compute the adj_amt which is the product of the previous two columns.
For this, I need to fill the nulls in av_quantity with the first non-null value in the partition. The query I use is below.
select
  CASE WHEN av_quantity is null then
    -- the boolean argument (True) tells first_value to skip null values
    first_value(av_quantity, True) over (
      partition by store_no order by product_id
      range between current row and unbounded following
    )
  else av_quantity
  end as adj_av_quantity
I'm having trouble with the SQL required to get the adjusted cost: it does not pull the factor from the row that supplied the first non-null value, but still fetches the factor from the current row while using adj_av_quantity. Any thoughts on how I could do this?
FYI I've simplified the data here. The actual dataset is pretty huge (> 125 million rows with 800+ columns) so I won't be able to use joins and have to do this via window functions. I'm using spark-sql

How to build query where 2 fields are inter dependent in MSSQL

I am new to the world of SQL queries. I am comfortable writing basic SQL queries limited to CRUD operations.
I am now working on a project where I have to write complex queries and I am seeking help on how to do it.
Scenario
I have a table x
The logic I need to implement is
1. The first record starts with some default value, let us say 0, as StartCount.
2. I need to add the numbers Add1+Add2 and deduct Minus.
3. The result of step 2 + StartCount becomes my EndCount.
4. The next month's StartCount is the EndCount of the previous row.
5. I have to repeat steps 2, 3, 4 for all the rows in the table.
How can I do this using SQL?
You want a cumulative sum, which is available using window/analytic functions. It is something like this:
select x.*,
(first_value(startcount) over (order by <ordercol>) +
sum(add1 + add2 - minus) over (order by <ordercol>)
) as yourvalue
from x;
<ordercol> is needed because SQL tables represent unordered sets. You need a column that specifies the ordering of the rows.
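As a rough illustration of the cumulative-sum query above, here is a small runnable sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The month column stands in for <ordercol>, and the sample figures are invented for the example; neither comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE x (month INTEGER, startcount INTEGER, "
    "add1 INTEGER, add2 INTEGER, minus INTEGER)"
)
# Hypothetical data: only the first row carries the default StartCount (0).
conn.executemany(
    "INSERT INTO x VALUES (?, ?, ?, ?, ?)",
    [(1, 0, 5, 3, 2), (2, None, 1, 1, 4), (3, None, 10, 0, 3)],
)

# EndCount = StartCount of the first row + running total of (add1 + add2 - minus).
rows = conn.execute(
    """
    SELECT month,
           FIRST_VALUE(startcount) OVER (ORDER BY month)
             + SUM(add1 + add2 - minus) OVER (ORDER BY month) AS endcount
    FROM x
    ORDER BY month
    """
).fetchall()

end_counts = [endcount for _, endcount in rows]
for month, endcount in rows:
    print(month, endcount)
```

Each row's StartCount is then simply the previous row's endcount, so it does not need to be stored in the table at all.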

Countif query in access

I am trying to run a query that calculates with a countif function, but I am having trouble with it. I have used the Count and IIf functions in the builder, but I think something weird is going on. I am trying to count the number of times each value occurs in a column, so I would rather not compare against one specific value, if that's possible?
Thanks!
To count the number of times a value appears, you can use something like the query below.
If you want to know how many times each value appears, just omit the WHERE clause (without a sample of your data, I've used a table in the database I'm working on).
SELECT ProcessID,
COUNT(ProcessID)
FROM tbl_PrimaryData_Step1
WHERE ProcessID = 4
GROUP BY ProcessID
If you need just the value, you can use:
SELECT COUNT(ProcessID)
FROM tbl_PrimaryData_Step1
WHERE ProcessID = 4
GROUP BY ProcessID
Another way is:
SELECT DCOUNT("ProcessID","tbl_PrimaryData_Step1","ProcessID = 4")
Edit:
In reply to your comment on your original post this SQL will give the result you're after:
SELECT Concatenate,
COUNT(Concatenate)
FROM MyTable
GROUP BY Concatenate
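To make the difference between the filtered count and the per-value count concrete, here is a small sketch run through Python's built-in sqlite3 module rather than Access. The table and column names are taken from the answer above, the sample ProcessID values are invented, and DCOUNT is left out because it is specific to Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_PrimaryData_Step1 (ProcessID INTEGER)")
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO tbl_PrimaryData_Step1 VALUES (?)",
    [(4,), (2,), (4,), (7,), (4,), (2,)],
)

# Count of one specific value.
count_of_4 = conn.execute(
    "SELECT COUNT(ProcessID) FROM tbl_PrimaryData_Step1 WHERE ProcessID = 4"
).fetchone()[0]

# Count of every value: same query without the WHERE clause, grouped by value.
counts = dict(
    conn.execute(
        "SELECT ProcessID, COUNT(ProcessID) "
        "FROM tbl_PrimaryData_Step1 GROUP BY ProcessID"
    ).fetchall()
)

print(count_of_4)
print(counts)
```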

Apply a single case statement to all columns in sql

I need to get the sum of each column of my table. So I used select sum(col1), sum(col2), etc.
If the sum is null, I need to get 0, else the value of the sum. So I used "select case when sum(col1) is null then 0 else sum(col1) end as sum_col1".
I have around 40 such columns in my table. Do I need to write "case when sum(col n) then..." 40 times in my query?
I'm working on Oracle 9 g.
Thanks
I have around 40 such columns in my table. Do i need to write " case
when sum(col n) then..." 40 times in my query?
Short answer: Yes.
Longer answer: You might be able to use some kind of dynamic SQL to generate the statement automatically from the column metadata. But it might not be worth the trouble, as you can often just as easily copy-paste the statement in your query editor. All things considered, having a table with 40 columns that you need to sum indicates a badly designed data model. When working with a badly designed data model, you pay the price at query time...
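As a rough sketch of the "generate the statement from the column metadata" idea, the snippet below builds the repeated CASE expressions from a column list. It runs against SQLite through Python's built-in sqlite3 module, not Oracle: the table my_table, its three columns, and the use of PRAGMA table_info are assumptions for the demo. On Oracle the column names would have to come from the data dictionary (for example USER_TAB_COLUMNS) instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical three-column table standing in for the 40-column one.
conn.execute("CREATE TABLE my_table (col1 INTEGER, col2 INTEGER, col3 INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, None, 5), (2, None, None)],
)

# PRAGMA table_info returns one row per column; the column name is the second field.
columns = [row[1] for row in conn.execute("PRAGMA table_info(my_table)")]

# Build one "case when sum(col) is null ..." expression per column.
select_list = ",\n       ".join(
    f"CASE WHEN SUM({c}) IS NULL THEN 0 ELSE SUM({c}) END AS sum_{c}"
    for c in columns
)
query = f"SELECT {select_list}\nFROM my_table"

print(query)
sums = conn.execute(query).fetchone()
print(sums)
```

The generated text can be pasted into a query editor, which keeps the 40 repetitions out of hand-written code while still producing plain static SQL.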

MS SQL 2000 - How to efficiently walk through a set of previous records and process them in groups. Large table

I'd like to consult one thing. I have table in DB. It has 2 columns and looks like this:
Name   bilance
Jane   +3
Jane   -5
Jane   0
Jane   -8
Jane   -2
Paul   -1
Paul   2
Paul   9
Paul   1
...
I have to walk through this table and if I find record with different "name" (than was on previous row) I process all rows with the previous "name". (If I step on the first Paul row I process all Jane rows)
The processing goes like this:
Now I work only with Jane records and walk through them one by one. On each record I stop and compare it with all previous Jane rows one by one.
The task is to summarize the "bilance" column (within the scope of the current person) if the values have different signs.
Summary:
I loop through this table on 3 levels at once (nested loops):
1st level = search for changes of "name" column
2nd level = if change was found, get all rows with previous "name" and walk through them
3rd level = on each row stop and walk through all previous rows with current "name"
Can this be solved only using CURSOR and FETCHING, or is there some smoother solution?
My real table has 30 000 rows and 1500 people, and if I do the logic in PHP, it takes several minutes and then times out. So I would like to rewrite it in MS SQL 2000 (no other DB is allowed). Are cursors a fast solution, or is it better to use something else?
Thank you for your opinions.
UPDATE:
There are lots of questions about my "summarization". Problem is a little bit more difficult than I explained. I simplified it just to describe my algorithm.
Each row of my table contains many more columns. The most important is month. That's why there are several rows for each person: each one is for a different month.
"Bilances" are the "overtime hours" and "arrear hours" of workers. I need to summarize + and - bilances to neutralize them using values from previous months. I want to have as many zeroes as possible. The whole table must stay as it is; only the bilances must be changed to zeroes.
Example:
Row (Jane -5) will be summarized with row (Jane +3). Instead of 3 I will get 0 and instead of -5 I will get -2. Because I used this -5 to reduce +3.
Next row (Jane 0) won't be affected
Next row (Jane -8) can not be used, because all previous bilances are negative
etc.
You can sum all the values per name using a single SQL statement:
select
name,
sum(bilance) as bilance_sum
from
my_table
group by
name
order by
name
On the face of it, it sounds like this should do what you want:
select Name, sum(bilance)
from table
group by Name
order by Name
If not, you might need to elaborate on how the Names are sorted and what you mean by "summarize".
I'm not sure what you mean by this line... "The task is to sumarize "bilance" column (in the scope of actual person) if they have different signs".
But, it may be possible to use a group by query to get a lot of what you need.
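select name,
       case when bilance < 0 then 'negative' when bilance >= 0 then 'positive' end as bilance_sign,
       count(*) as row_count
from my_table
group by name, case when bilance < 0 then 'negative' when bilance >= 0 then 'positive' end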
That might not be perfect syntax for the case statement, but it should get you really close.
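For reference, the neutralizing step described in the question's UPDATE (each row is offset against earlier rows of the same person that have the opposite sign) can be sketched outside the database. This is only one reading of the description, written in Python rather than T-SQL; the function name neutralize and the assumption that rows arrive already ordered by name and month are mine.

```python
from itertools import groupby

def neutralize(rows):
    """Offset each bilance against earlier opposite-sign bilances of the same name.

    rows: list of (name, bilance) tuples, already ordered by name and month.
    """
    result = []
    for name, group in groupby(rows, key=lambda r: r[0]):
        values = [bilance for _, bilance in group]
        for i in range(len(values)):
            for j in range(i):
                # A negative product means the two rows have opposite signs.
                if values[i] * values[j] < 0:
                    offset = min(abs(values[i]), abs(values[j]))
                    # Move both values toward zero by the same amount.
                    values[i] -= offset if values[i] > 0 else -offset
                    values[j] -= offset if values[j] > 0 else -offset
        result.extend((name, v) for v in values)
    return result

rows = [
    ("Jane", 3), ("Jane", -5), ("Jane", 0), ("Jane", -8), ("Jane", -2),
    ("Paul", -1), ("Paul", 2), ("Paul", 9), ("Paul", 1),
]
neutralized = neutralize(rows)
for name, bilance in neutralized:
    print(name, bilance)
```

For Jane this reproduces the example from the question: +3 becomes 0, -5 becomes -2, and the later negative rows are left untouched because no earlier positive balance remains.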