Select case mess in JFreeChart - sql

I have a column (cliente_x_hora, a numeric field) whose values I put into intervals, counting the rows that fall in each interval. I have 3 text fields (number of intervals, width of each interval, and initial value). When I fill in only the first two (5 intervals of width 1000), the query runs flawlessly and generates the expected bar chart.
Query (with the first two text fields filled in):
SELECT INTERVAL, COUNT(*) TOTAL FROM (
SELECT CASE WHEN CLIENTE_X_HORA>0 AND CLIENTE_X_HORA<=1000.00 THEN '0<CLIENTE_X_HORA> <=1000.00'
WHEN CLIENTE_X_HORA>1000.00 AND CLIENTE_X_HORA<=2000.00 THEN '1000.00<CLIENTE_X_HORA><=2000.00'
WHEN CLIENTE_X_HORA>2000.00 AND CLIENTE_X_HORA<=3000.00 THEN '2000.00<CLIENTE_X_HORA><=3000.00'
WHEN CLIENTE_X_HORA>3000.00 AND CLIENTE_X_HORA<=4000.00 THEN '3000.00<CLIENTE_X_HORA><=4000.00'
ELSE '4000.00<CLIENTE_X_HORA' END INTERVAL, CLIENTE_X_HORA FROM SGD_CAUSA)
GROUP BY INTERVAL ORDER BY TOTAL
The barchart is
The problem appears when I also fill in the last field (initial value, for example 2000): my bar chart goes crazy (I believe it is adding the discarded values below 2000 to the last bar):
That ELSE bar (>6000) should be much smaller than what is shown. How can I solve that?
Best Regards,
DDias
CLARIFICATION from OP:
The query is the same as above but begins at 2000:
SELECT CASE WHEN CLIENTE_X_HORA>2000 AND CLIENTE_X_HORA<=3000.00...
and ends at 6000:
ELSE '6000.00<CLIENTE_X_HORA' END INTERVAL, CLIENTE_X_HORA FROM SGD_CAUSA) GROUP BY INTERVAL ORDER BY TOTAL
Putting the result in table form is impractical (we are talking about over 87 thousand rows). This always happens when I give an initial value other than zero.

Your ELSE does exactly what an ELSE does: it catches everything that is not matched by one of the specific WHENs.
So if you do not start from zero, that last bar will include everything below the lowest limit in addition to everything above the highest limit.
If you do not want this behavior, do not use ELSE at all. Use WHEN CLIENTE_X_HORA > 6000.00 (or whatever your highest limit is) as the last condition. Rows that match no WHEN then get a NULL interval instead of being counted in the last one.
EDIT:
In your inner query, filter out (with WHERE) the values that are below the lowest limit.
Since the unneeded low range is then gone, you no longer need the HAVING clause we added, and you can even go back to using ELSE.
If your lowest limit is zero, you will be filtering out everything below 0, which I assume is nothing.
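The effect described above can be reproduced on a small scale. This is only a sketch: it uses Python's sqlite3 module with an in-memory table, the sample values and the bucket labels are made up for illustration, and the alias is called bucket rather than INTERVAL. It counts the buckets once without and once with the WHERE filter on the lowest limit.

```python
import sqlite3

# Hypothetical sample data standing in for SGD_CAUSA.CLIENTE_X_HORA.
values = [500, 1500, 1800, 2500, 3500, 4500, 5500, 6500, 7000]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SGD_CAUSA (CLIENTE_X_HORA NUMERIC)")
conn.executemany("INSERT INTO SGD_CAUSA VALUES (?)", [(v,) for v in values])

# Buckets start at 2000 and stop at 6000; ELSE catches every unmatched row.
BUCKETS = """
    SELECT CASE
             WHEN CLIENTE_X_HORA > 2000 AND CLIENTE_X_HORA <= 3000 THEN '2000-3000'
             WHEN CLIENTE_X_HORA > 3000 AND CLIENTE_X_HORA <= 4000 THEN '3000-4000'
             WHEN CLIENTE_X_HORA > 4000 AND CLIENTE_X_HORA <= 5000 THEN '4000-5000'
             WHEN CLIENTE_X_HORA > 5000 AND CLIENTE_X_HORA <= 6000 THEN '5000-6000'
             ELSE '>6000'
           END AS bucket
    FROM SGD_CAUSA
"""


def count_buckets(where=""):
    """Count rows per bucket, optionally filtering the inner query."""
    sql = f"SELECT bucket, COUNT(*) FROM ({BUCKETS} {where}) GROUP BY bucket"
    return dict(conn.execute(sql).fetchall())


unfiltered = count_buckets()                            # ELSE also swallows values <= 2000
filtered = count_buckets("WHERE CLIENTE_X_HORA > 2000")  # low values removed first

print(unfiltered[">6000"], filtered[">6000"])
```

With this sample data the unfiltered '>6000' bucket also contains the three values below 2000, while the filtered one contains only the two values that really exceed 6000.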

Related

Finding the last 4, 3, 2, 1 months consecutive order drops among clients based on drop variance

I have this query that finds the drop percentage for a set of clients based on the orders they have received (i.e. it finds the percentage difference in orders by comparing the current month with the previous month). What I want is a field that shows which clients had a continuous drop over 4 months, 3 months, 2 months, and 1 month.
I know this can only be achieved by comparing the last 4 months using the LAG function or subqueries. Could you please help me out with this one? I would appreciate it very much.
select
fd.customers2, fd.Month1, fd.year1, fd.variance, case when
(fd.variance < -0.00001 and fd.year1 = '2022.0' and fd.Month1 = '1')
then '1month drop' else fd.customers2 end as 1_most_host_drop
from 
(SELECT
c.*,
sa.customers as customers2,
sum(sa.order) as orders,
date_part(mon, sa.date) as Month1,
date_part(year, sa.date) as year1,
(cast(orders - LAG(orders) OVER(Partition by customers2 ORDER BY
 year1, Month1) as NUMERIC(10,2))/NULLIF(LAG(orders) 
OVER(partition by customers2 ORDER BY year1, Month1) * 1, 0)) AS variance
FROM stats sa join (select distinct
    d.id, d.customers 
     from configer d 
    ) c on sa.customers=c.customers
WHERE sa.date >= '2021-04-1' 
GROUP BY Month1, sa.customers, c.id,  year1, 
     c.customers)fd
In a spirit of friendliness: I think you are a little premature in posting this here as there are several issues with the syntax before even reaching the point where you can solve the problem:
You have at least two places with a comma immediately preceding the word FROM:
...AS variance, FROM stats_archive sa ...
...d.customers, FROM config d...
I recommend you don't use VARIANCE as an alias (it is a system function in PostgreSQL and so is likely also a system function name in Redshift).
Not super important, but there's no need for c.*; just select the columns you will use.
DATE_PART expects a string as the first parameter in PostgreSQL (Redshift may also accept an unquoted date part): DATE_PART('mon',current_date)
I might be wrong about this, but I suspect you cannot use column aliases in the PARTITION BY or ORDER BY of a window function. Put the originating expressions there instead:
... OVER (PARTITION BY customers2 ORDER BY DATE_PART('year',sa.date),DATE_PART('mon',sa.date))
LAG takes up to three parameters in PostgreSQL (check that Redshift accepts the third): (1) the column you want to retrieve the value from, (2) the row offset, where a positive integer indicates how many rows prior to the current row, according to the partition and order context, the value should be taken from, and (3) the value the function should return as a default (for example, for the first row in the partition). As such, you don't need NULLIF. So, to get the row immediately prior to the current row, or 0 if the current row is the first row in the partition:
LAG(orders,1,0) OVER (PARTITION BY customers2 ORDER BY DATE_PART('year',sa.date),DATE_PART('mon',sa.date))
If you use 0 as a default in the calculation of what is currently aliased variance, you will almost certainly run into a division-by-zero error, either now or, worse, when you least expect it in the future. You should protect against that with some CASE logic or, better, provide a more appropriate default value or, even better, calculate the LAG with the default 0 and then filter out the 0 rows before doing the calculation.
Don't rely on column aliases in the GROUP BY. PostgreSQL does accept output column names there, but many databases do not, and it is easy to group by the wrong thing. The safe form is to reference each field that is not participating in an aggregate, whether through direct mention (sa.date) or indirectly in an expression (DATE_PART('mon',sa.date)).
Your date should be '2021-04-01'
All in all, without sample data, expected results using the posted sample data and without first removing syntax errors, it is a tall order to attempt to offer advice on the problem which is any more specific than:
Build the source of the calculation as a completely separate query first. Calculate the LAG in that source query. Only when you've run that source query and verified that the LAG is producing the correct result should you then wrap it as a sub-query or CTE (not sure if Redshift supports these, but presumably) at which point you can filter out the rows with a zero as the denominator (the first month of orders for each customer).
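The staged approach described above might look like the following. This is a minimal sketch and not the poster's real schema: it runs on SQLite (version 3.25 or later, for window functions) through Python's sqlite3 module, and the table monthly_orders, its columns, and the sample rows are all hypothetical. SQLite does accept the three-argument LAG; whether the target database does is an assumption to verify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pre-aggregated table: one row per customer and month.
conn.execute("CREATE TABLE monthly_orders (customer TEXT, month TEXT, orders INTEGER)")
conn.executemany(
    "INSERT INTO monthly_orders VALUES (?, ?, ?)",
    [
        ("acme", "2022-01", 100),
        ("acme", "2022-02", 80),
        ("acme", "2022-03", 60),
        ("bolt", "2022-01", 50),
        ("bolt", "2022-02", 75),
    ],
)

# Step 1: the source query only fetches the previous month's orders,
# defaulting to 0 for the first row of each customer.
source_sql = """
    SELECT customer, month, orders,
           LAG(orders, 1, 0) OVER (PARTITION BY customer ORDER BY month) AS prev_orders
    FROM monthly_orders
"""

# Step 2: once step 1 is verified, wrap it in a CTE, drop the rows with a
# zero denominator, and only then compute the relative change.
change_sql = f"""
    WITH src AS ({source_sql})
    SELECT customer, month,
           ROUND((orders - prev_orders) * 1.0 / prev_orders, 2) AS pct_change
    FROM src
    WHERE prev_orders <> 0
    ORDER BY customer, month
"""

source_rows = conn.execute(source_sql + " ORDER BY customer, month").fetchall()
changes = conn.execute(change_sql).fetchall()

for row in changes:
    print(row)
```

Inspecting source_rows first makes it easy to confirm that prev_orders lines up with the right month before any division happens.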
Good luck!

Wrapping a range of data

How would I select a rolling/wrapping* set of rows from a table?
I am trying to select a number of records (per type, 2 or 3) for each day, wrapping when I 'run out'.
E.g.
2018-03-15: YyBiz, ZzCo, AaPlace
2018-03-16: BbLocation, CcStreet, DdInc
These are rendered within a SSRS report for Dynamics CRM, so I can do light post-query operations.
Currently I get to:
2018-03-15: YyBiz, ZzCo
2018-03-16: AaPlace, BbLocation, CcStreet
First, getting a number for each record with:
SELECT name, ROW_NUMBER() OVER (PARTITION BY type ORDER BY name) as RN
FROM table
Within SSRS, I then adjust RN to reflect the number of each type I need:
OnPageNum = FLOOR((RN+num_of_type-1)/num_of_type)-1
--Shift RN to be 0-indexed.
Resulting in AaPlace, BbLocation and CcStreet having a PageNum of 0, DdInc of 1, ... YyBiz and ZzCo of 8.
Then using an SSRS Table/Matrix linked to the dataset, I set the row filter to something like:
RowFilter = MOD(DateNum, NumPages(type)) == OnPageNum
Where DateNum is essentially days since epoch, and each page has a separate table and day passed in.
At this point, it shows only N records of a type per page, but if the total number of records of a type isn't a multiple of the number of records per page for that type, there will be pages with fewer records than required.
Is there an easier way to approach this/what's the next step?
*Wrapping such as Wraparound found in videogames, seamless resetting to 0.
To achieve this effect, I found that offsetting the row number by -DateNum*num_of_type (negative for positive ordering) and then taking it modulo COUNT(type) provides the correct "wrap around" effect.
To get the desired pagination, the result then just has to be divided by num_of_type and floored, as below:
RowFilter: FLOOR(((RN-DateNum*num_of_type) % count(type))/num_of_type) == 0
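The filter above can be checked outside SSRS with a few lines of Python. This is a sketch under assumptions: rn is taken as the 0-indexed row number, the list of names is made up, and Python's % always returns a non-negative result for a positive modulus, which may not hold for the Mod operator in other environments.

```python
def rows_for_day(names, date_num, per_page):
    """Return the names shown on a given day, wrapping past the end of the list."""
    count = len(names)
    selected = []
    for rn, name in enumerate(names):  # rn is the 0-indexed row number
        # Shift by the rows consumed on earlier days, then wrap around.
        shifted = (rn - date_num * per_page) % count
        if shifted // per_page == 0:  # same test as FLOOR(shifted / per_page) == 0
            selected.append(name)
    return selected


# Hypothetical list of 7 names with 3 shown per day.
names = ["Aa", "Bb", "Cc", "Dd", "Ee", "Ff", "Gg"]
day0 = rows_for_day(names, 0, 3)
day1 = rows_for_day(names, 1, 3)
day2 = rows_for_day(names, 2, 3)  # wraps: last row plus the first two

print(day0, day1, day2)
```

Every day gets exactly per_page rows, even though 7 is not a multiple of 3. The rows come back in their original order, so a wrapped page lists the leading rows before the trailing one.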

Select Average of Top 25% of Values in SQL

I'm currently writing a stored procedure for my client to populate some tables that will be used to generate SSRS reports later on. Some of the data is based on specific stock formulas that are run on each of their clients' quarterly data (sent to them by their clients). The other part of the data is generated by comparing those results against those from other, similar sized clients. One of the things that they want tracked in their reports is the average of the top 25% of formula results for that particular comparison group.
To give a better picture of it, imagine the following fields that I have in a temp table:
FormulaID int
Value decimal (18,6)
I want to do the following: Given a specific FormulaID return the average of the top 25% of Value.
I know how to take an average in SQL, but I don't know how to do it against only the top 25% of a specific group.
How would I write this query?
I guess you can do something like this...
SELECT AVG(Q.ColA) Avg25Prec
FROM (
SELECT TOP 25 Percent ColA
FROM Table_Name
ORDER BY SomeCOlumn
) Q
Here's what I did, given the table shown above:
select AVG(t.Value)
from (select top 25 percent Value
from #TempGroupTable
where FormulaID = @PassedInFormulaID
order by Value desc) as t
The desc must be there because top ... percent does not compare values. It simply grabs the first x rows of the ordered result, with x equal to 25% of the number of rows being queried. With order by Value desc, those are the 25% of rows with the highest Value, and that set is what gets averaged.
As a side note, this also means that if you want the bottom 25% instead, or if your formula results are like a golf score (i.e. lowest is best), all you need to do is remove the desc and you are good to go.
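The same idea can be sketched outside the database to see why the sort direction matters. This is an illustration in Python, not the stored procedure itself: the function name and sample values are hypothetical, the input is assumed to be non-empty, and the row count is rounded up, which is how TOP ... PERCENT is documented to treat fractional row counts in SQL Server.

```python
from math import ceil


def average_top_percent(values, percent, highest_first=True):
    """Average the first `percent` of the values after sorting them."""
    ordered = sorted(values, reverse=highest_first)
    # Assumption: a fractional row count is rounded up to the next whole row.
    take = ceil(len(ordered) * percent / 100)
    top = ordered[:take]
    return sum(top) / len(top)


# Hypothetical formula results for one FormulaID.
values = [10, 20, 30, 40, 50, 60, 70, 80]
top_avg = average_top_percent(values, 25)                          # like ORDER BY Value DESC
bottom_avg = average_top_percent(values, 25, highest_first=False)  # like ORDER BY Value

print(top_avg, bottom_avg)
```

Flipping highest_first is the equivalent of removing desc: the slice is the same size, only the end of the ordering it is taken from changes.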

Oracle SQL find minimum value above a certain threshold

I had a quick Google without much luck: is there any way in Oracle SQL to return the minimum value of something, but only above a certain number (i.e. the minimum value that is above a given negative number)? Currently I'm using this line of code:
min(ROUND(IA.ASM_START_DATE -REF.ASM_START_DATE,0)) over (partition by IA.ASM_ID) min_wk
This returns the lowest difference per ID. It works up to a point, but I want it to bring back the lowest difference above -10. Ideally I'd achieve this in the SELECT list rather than in the WHERE clause, as I want to use it to identify issues, not exclude them from the report completely.
A simple hack is to use a CASE expression to turn any values that are too low into NULL, since MIN ignores NULLs and they therefore won't change the minimum:
min(case when ROUND(IA.ASM_START_DATE -REF.ASM_START_DATE,0)<-10 then null else ROUND(IA.ASM_START_DATE -REF.ASM_START_DATE,0) end) over (partition by IA.ASM_ID)
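A small reproduction of the trick, as a sketch only: it uses SQLite (3.25 or later) via Python's sqlite3 module instead of Oracle, and the table diffs with its precomputed diff column and sample rows is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the already computed date differences.
conn.execute("CREATE TABLE diffs (asm_id INTEGER, diff INTEGER)")
conn.executemany(
    "INSERT INTO diffs VALUES (?, ?)",
    [(1, -30), (1, -4), (1, 7), (2, -12), (2, 3)],
)

# Values below -10 become NULL inside MIN, so they cannot win the minimum,
# but their rows are still returned because nothing is filtered in WHERE.
rows = conn.execute("""
    SELECT asm_id, diff,
           MIN(CASE WHEN diff < -10 THEN NULL ELSE diff END)
               OVER (PARTITION BY asm_id) AS min_wk
    FROM diffs
    ORDER BY asm_id, diff
""").fetchall()

for row in rows:
    print(row)
```

The rows with -30 and -12 stay in the output, which matches the goal of flagging them rather than excluding them.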

MS SQL 2000 - How to efficiently walk through a set of previous records and process them in groups. Large table

I'd like to ask for advice on one thing. I have a table in a DB. It has 2 columns and looks like this:
Name    bilance
Jane    +3
Jane    -5
Jane    0
Jane    -8
Jane    -2
Paul    -1
Paul    2
Paul    9
Paul    1
...
I have to walk through this table, and whenever I find a record with a different "name" than the previous row, I process all rows with the previous "name". (When I reach the first Paul row, I process all Jane rows.)
The processing goes like this:
Now I work only with the Jane records and walk through them one by one. On each record I stop and compare it with all previous Jane rows, one by one.
The task is to summarize the "bilance" column (within the scope of the current person) when the values have different signs.
Summary:
I loop through this table at 3 nested levels:
1st level = search for changes in the "name" column
2nd level = if a change was found, get all rows with the previous "name" and walk through them
3rd level = on each row, stop and walk through all previous rows with the current "name"
Can this be solved only with a CURSOR and FETCH, or is there a smoother solution?
My real table has 30 000 rows and 1500 people, and if I do the logic in PHP it takes several minutes and then times out. So I would like to rewrite it in MS SQL 2000 (no other DB is allowed). Are cursors a fast solution, or is it better to use something else?
Thank you for your opinions.
UPDATE:
There are lots of questions about my "summarization". The problem is a little more complicated than I explained; I simplified it just to describe my algorithm.
Each row of my table contains many more columns. The most important one is the month. That is why there are several rows for each person: each one is for a different month.
The "bilances" are the overtime hours and the hours owed by workers. I need to net the + and - bilances against each other, using values from previous months. I want to have as many zeroes as possible. The table itself must stay as it is; only the bilances must be changed towards zero.
Example:
Row (Jane -5) will be netted against row (Jane +3). Instead of 3 I will get 0, and instead of -5 I will get -2, because I used this -5 to reduce the +3.
The next row (Jane 0) won't be affected.
The next row (Jane -8) cannot be used, because none of the previous bilances is positive.
etc.
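One possible reading of this netting rule, written out as a minimal Python sketch: the function name is made up, the rows of one person are assumed to be already in month order, and each row is offset against earlier rows of the opposite sign until one of the two reaches zero. It only illustrates the rule and is not a replacement for the SQL solution being asked about.

```python
def neutralize(balances):
    """Offset each balance against earlier balances of the opposite sign."""
    result = list(balances)
    for i in range(len(result)):
        for j in range(i):
            if result[i] * result[j] < 0:  # opposite signs, neither is zero
                # Move both values toward zero by the smaller magnitude.
                amount = min(abs(result[i]), abs(result[j]))
                result[i] += amount if result[i] < 0 else -amount
                result[j] += amount if result[j] < 0 else -amount
    return result


jane = neutralize([3, -5, 0, -8, -2])
paul = neutralize([-1, 2, 9, 1])

print(jane, paul)
```

For Jane this reproduces the example: +3 becomes 0, -5 becomes -2, and the remaining rows are left alone because no earlier positive balance is available.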
You can sum all the values per name using a single SQL statement:
select
name,
sum(bilance) as bilance_sum
from
my_table
group by
name
order by
name
On the face of it, it sounds like this should do what you want:
select Name, sum(bilance)
from table
group by Name
order by Name
If not, you might need to elaborate on how the Names are sorted and what you mean by "summarize".
I'm not sure what you mean by this line: "The task is to summarize "bilance" column (in the scope of actual person) if they have different signs".
But it may be possible to use a GROUP BY query to get a lot of what you need.
select name, case when bilance < 0 then 'negative' when bilance >= 0 then 'positive' end as sign_group, count(*)
from table
group by name, case when bilance < 0 then 'negative' when bilance >= 0 then 'positive' end
That might not be perfect syntax for the case statement, but it should get you really close.
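To see what a query grouped by sign returns, here is a hedged sketch that runs it on SQLite through Python's sqlite3 module rather than on MS SQL 2000. The table name my_table and the alias sign_group are assumptions, the rows are the sample rows from the question, and a SUM per group is added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, bilance INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("Jane", 3), ("Jane", -5), ("Jane", 0), ("Jane", -8), ("Jane", -2),
     ("Paul", -1), ("Paul", 2), ("Paul", 9), ("Paul", 1)],
)

# Group by the CASE expression itself, not by the raw bilance value,
# so each person gets at most one 'negative' and one 'positive' row.
rows = conn.execute("""
    SELECT name,
           CASE WHEN bilance < 0 THEN 'negative' ELSE 'positive' END AS sign_group,
           COUNT(*) AS row_count,
           SUM(bilance) AS bilance_sum
    FROM my_table
    GROUP BY name, CASE WHEN bilance < 0 THEN 'negative' ELSE 'positive' END
    ORDER BY name, sign_group
""").fetchall()

for row in rows:
    print(row)
```

Zero is counted as 'positive' here, mirroring the bilance >= 0 branch of the answer.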