I've been asked to modify a report (which unfortunately was written horribly!! not by me!) to include a count of days. Please note the "Days" is not calculated using "StartDate" & "EndDate" below. The problem is, there are multiple rows per record (users want to see the detail for start & enddate), so my total for "Days" are counting for each row. How can I get the total 1 time without the total in column repeating?
This is what the data looks like right now:
ID Description startdate enddate Days
REA145681 Emergency 11/17/2011 11/19/2011 49
REA145681 Emergency 12/6/2011 12/9/2011 49
REA145681 Emergency 12/10/2011 12/14/2011 49
REA146425 Emergency 11/23/2011 12/8/2011 54
REA146425 Emergency 12/9/2011 12/12/2011 54
I need this:
ID Description startdate enddate Days
REA145681 Emergency 11/17/2011 11/19/2011 49
REA145681 Emergency 12/6/2011 12/9/2011
REA145681 Emergency 12/10/2011 12/14/2011
REA146425 Emergency 11/23/2011 12/8/2011 54
REA146425 Emergency 12/9/2011 12/12/2011
Help please. This is how the users want to see the data.
Thanks in advance!
Liz
--- Here is the query simplified:
select id
,description
,startdate -- users want to see all start dates and enddates
,enddate
,days = datediff(d,Isnull(actualstardate,anticipatedstartdate) ,actualenddate)
from table
As you didn't provide the data of your tables I'll operate over your result as if it was a table. This will result in what you're looking for:
select *,
case row_number() over (partition by id order by id)
when 1 then days
end
from t
Edit:
Looks like you DID added some SQL code. This should be what you're looking for:
select *,
case row_number() over (partition by id order by id)
when 1 then
datediff(d,Isnull(actualstardate,anticipatedstartdate) ,actualenddate)
end
from t
That is a task for the reporting tool. You will have to write something like he next code in teh Display Properties of the Days field:
if RowNumber > 1 AND id = previous_row(id)
then -- hide the value of Days
Colour = BackgroundColour
Days = NULL
Days = ' '
Display = false
... (anything that works)
So they want the output to be exactly the same except that they don't want to see the days listed multiple times for each ID value? And they're quite happy to see the ID and Description repeatedly but the Days value annoys them?
That's not really an SQL question. SQL is about which rows, columns and derived values are supposed to be presented in what order and that part seems to be working fine.
Suppressing the redundant occurrences of the Days value is more a matter of using the right tool. I'm not up on the current tools but the last time I was, QMF was very good for this kind of thing. If a column was the basis for a control break, you could, in effect, select an option for that column that told it not to repeat the value of the control break repeatedly. That way, you could keep it from repeating ID, Description AND Days if that's what you wanted. But I don't know if people are still using QMF and I have no idea if you are. And unless the price has come way down, you don't want to go out and buy QMF just to suppress those redundant values.
Other tools might do the same kind of thing but I can't tell you which ones. Perhaps the tool you are using to do your reporting - Crystal Reports or whatever - has that feature. Or not. I think it was called Outlining in QMF but it may have a different name in your tool.
Now, if this report is being generated by an application program, that is a different kettle of Fish. An application could handle that quite nicely. But most people use end-user reporting tools to do this kind of thing to avoid the greater cost involved in writing programs.
We might be able to help further if you specify what tool you are using to generate this report.
Related
I was asked to assist with developing a report to retrieve a 25% sample of random transactions within a specific date range. I am not a programmer but I was able to devise the following fairly quickly:
SELECT TOP 25 PERCENT account.CID, account.ACCT, account.NAME, log.DATE, log.action_txt, log.field_nm, log.from_data, log.to_data, log.tran_id, log.init
FROM account INNER JOIN log ON account.ACCT = log.ACCT
GROUP BY account.CID, account.ACCT, account.NAME, log.DATE, log.action_txt, log.field_nm, log.from_data, log.to_data, log.tran_id, log.init
HAVING (((log.DATE) Between #2/7/2018# And #6/15/2018#) AND ((log.action_txt)="mod" Or (log.action_txt)="del") AND ((log.init)="J1X"
ORDER BY log.tran_dt
This returns 25% of the records within the date range. Each record row is unique but each account number potentially has multiple records on each day. In some cases the records have the same date and tran_id as well.
Upon further discussion with the requester, he actually wants to see all of the transactions for 25% of the accounts that have activity on each day within the date range. Thus if there were 100 accounts on 3/1/2018 with records in this table, he wants to see all of the transactions for 25 of those accounts; if there were 60 accounts on 3/2/2018 with records in this table, he wants to see all of the transactions for 15 of those accounts; and so on.
I was thinking that an Access module would work best in this scenario as I believe there are multiple parts to this. I figured that I need a function to loop through the date range and for each day:
1. Count the account numbers only one time
2. Return all of the transactions for 25% of the total accounts
But as I mentioned, I am not a programmer and I am exhausted from searching possible solutions for the many parts.
I think the key to your question is that you only really need a pseudo random selection of results for your report. So you can force the Random number generator to reorder your results based on a value in the record and the current time.
Something like this should work - I assume your actiontxt field is a text field and pull out the length of each field and apply that to current date/time to create a pseudo random number that can be sorted.
All I really do is change your ORDER BY line
See if this works for you
SELECT TOP 25 PERCENT
account.CID, account.ACCT, account.NAME, log.DATE, log.action_txt, log.field_nm, log.from_data,
log.to_data, log.tran_id, log.init
FROM account
INNER JOIN log ON account.ACCT = log.ACCT
GROUP BY account.CID, account.ACCT, account.NAME, log.DATE, log.action_txt, log.field_nm, log.from_data, log.to_data, log.tran_id, log.init
HAVING (((log.DATE) Between #2/7/2018# And #6/15/2018#) AND ((log.action_txt)="mod" Or (log.action_txt)="del") AND ((log.init)="J1X"
ORDER BY Rnd(CLng(Now()*Len(log.action_txt))-(Now()*Len(log.action_txt)));
Modified similar idea from other StackOverflow question and response
I am discovering SQL as I have to build queries in my new company.I have understood the basic but here is where I am stuck, maybe you could help me figure this out :
I would like to mention a product as unprocurable if sellers rejected my orders twice. Tricky part I aggregate the furniture orders for all our local offices, therefore even though I sent my purchase order(s) to one unique seller (the one with the best offer at the moment) I might have multiple lines for each item (one per office)
See below table for purchase orders, see REF1 item should be set as unprocurable as both on 21 and 31 december my orders have been rejected (no matter the seller)
http://i.stack.imgur.com/r3W3E.jpg
So to put it in logic I would like to have something like this:
For each items with 2 latest purchase orders that were both made at different dates and rejected(0 value in the table) THEN attach a note to it saying "unprocurable" else put as procurable.
IF it was only 1 value I think I could go with
Select
item
, MAX(date)
, case
when confirmed_units = 0
then 'Unprocurable'
else 'procurable'
end
From
purchase_table
Where
date between TO_DATE('01/01/2013', 'MM/DD/YYYY') AND TO_DATE('{RUN_DATE_YYYY/MM/DD}', 'YYYY/MM/DD')
But now I need to check the two latest purchase orders and that are not from the same day.
I am a bit lost, could you give a hand please?
Thanks !
Your question is a little unclear... have you tried using something along the lines of:
SELECT TOP 2 etc, etc... order by [column]
I'd like to consult one thing. I have table in DB. It has 2 columns and looks like this:
Name...bilance
Jane...+3
Jane...-5
Jane...0
Jane...-8
Jane...-2
Paul...-1
Paul...2
Paul....9
Paul...1
...
I have to walk through this table and if I find record with different "name" (than was on previous row) I process all rows with the previous "name". (If I step on the first Paul row I process all Jane rows)
The processing goes like this:
Now I work only with Jane records and walk through them one by one. On each record I stop and compare it with all previous Jane rows one by one.
The task is to sumarize "bilance" column (in the scope of actual person) if they have different signs
Summary:
I loop through this table in 3 levels paralelly (nested loops)
1st level = search for changes of "name" column
2nd level = if change was found, get all rows with previous "name" and walk through them
3rd level = on each row stop and walk through all previous rows with current "name"
Can this be solved only using CURSOR and FETCHING, or is there some smoother solution?
My real table has 30 000 rows and 1500 people and If I do the logic in PHP, it takes long minutes and than timeouts. So I would like to rewrite it to MS SQL 2000 (no other DB is allowed). Are cursors fast solution or is it better to use something else?
Thank you for your opinions.
UPDATE:
There are lots of questions about my "summarization". Problem is a little bit more difficult than I explained. I simplified it just to describe my algorithm.
Each row of my table contains much more columns. The most important is month. That's why there are more rows for each person. Each is for different month.
"Bilances" are "working overtimes" and "arrear hours" of workers. And I need to sumarize + and - bilances to neutralize them using values from previous months. I want to have as many zeroes as possible. All the table must stay as it is, just bilances must be changed to zeroes.
Example:
Row (Jane -5) will be summarized with row (Jane +3). Instead of 3 I will get 0 and instead of -5 I will get -2. Because I used this -5 to reduce +3.
Next row (Jane 0) won't be affected
Next row (Jane -8) can not be used, because all previous bilances are negative
etc.
You can sum all the values per name using a single SQL statement:
select
name,
sum(bilance) as bilance_sum
from
my_table
group by
name
order by
name
On the face of it, it sounds like this should do what you want:
select Name, sum(bilance)
from table
group by Name
order by Name
If not, you might need to elaborate on how the Names are sorted and what you mean by "summarize".
I'm not sure what you mean by this line... "The task is to sumarize "bilance" column (in the scope of actual person) if they have different signs".
But, it may be possible to use a group by query to get a lot of what you need.
select name, case when bilance < 0 then 'negative' when bilance >= 0 then 'positive', count(*)
from table
group by name, bilance
That might not be perfect syntax for the case statement, but it should get you really close.
It would seem that there is a much simpler way to state the problem. Please see Edit 2, following the sample table.
I have a number of different products on a production line. I have the date that each product entered production. Each product has two identifiers: item number and serial number I have the total number of labour hours for each product by item number and by serial number (i.e. I can tell you how many hours went into each object that was manufactured and what the average build time is for each kind of object).
I want to determine how (if) varying the length of production runs affects the average time it takes to build a product (item number). A production run is the sequential production of multiple serial numbers for a single item number. We have historical records going back several years with production runs varying in length from 1 to 30.
I think to achieve this, I need to be able to assign 'run id'. To me, that means building a query that sorts by start date and calculates a new unique value at each change in item number. If I knew how to do that, I could solve the rest of the problem on my own.
So that suggests a series of related questions:
Am I thinking about this the right way?
If I am on the right track, how do I generate those run id values? Calculate and store is an option, although I have a (misguided?) preference for direct queries. I know exactly how I would generate the run numbers in Excel, but I have a (misguided?) preference to do this in the database.
If I'm not on the right track, where might I find that track? :)
Edit:
Table structure (simplified) with sample data:
AutoID Item Serial StartDate Hours RunID (proposed calculation)
1 Legend 1234 2010-06-06 10 1
3 Legend 1235 2010-06-07 9 1
2 Legend 1237 2010-06-08 8 1
4 Apex 1236 2010-06-09 12 2
5 Apex 1240 2010-06-10 11 2
6 Legend 1239 2010-06-11 10 3
7 Legend 1238 2010-06-12 8 3
I have shown that start date, serial, and autoID are mutually unrelated. I have shown the expectation that labour goes down as the run length increases (but this is a 'fact' only via received wisdom, not data analysis). I have shown what I envision as the heart of the solution, that being a RunID that reflects sequential builds of a single item. I know that if I could get that runID, I could group by run to get counts, averages, totals, max, min, etc. In addition, I could do something like hours/ to get percentage change from the start of the run. At that point I could graph the trends associated with different run lengths either globally across all items or on a per item basis. (At least I think I could do all that. I might have to muck about a bit, but I think I could get it done.)
Edit 2: This problem would appear to be: how do I get the 'starting' member (earliest start date) of each run when I don't already have a runID? (The runID shown in the sample table does not exist and I was originally suggesting that being able to calculate runID was a potentially viable solution.)
AutoID Item
1 Legend
4 Apex
6 Legend
I'm assuming that having learned how to find the first member of each run that I would then be able to use what I've learned to find the last member of each run and then use those two results to get all other members of each run.
Edit 3: my version of a query that uses the AutoID of the first item in a run as the RunID for all units in a run. This was built entirely from samples and direction provided by Simon, who has the accepted answer. Using this as the basis for grouping by run, I can produce a variety of run statistics.
SELECT first_product_of_run.AutoID AS runID, run_sibling.AutoID AS itemID, run_sibling.Item, run_sibling.Serial, run_sibling.StartDate, run_sibling.Hours
FROM (SELECT first_of_run.AutoID, first_of_run.Item, first_of_run.Serial, first_of_run.StartDate, first_of_run.Hours
FROM dbo.production AS first_of_run LEFT OUTER JOIN
dbo.production AS earlier_in_run ON first_of_run.AutoID - 1 = earlier_in_run.AutoID AND
first_of_run.Item = earlier_in_run.Item
WHERE (earlier_in_run.AutoID IS NULL)) AS first_product_of_run LEFT OUTER JOIN
dbo.production AS run_sibling ON first_product_of_run.Item = run_sibling.Item AND first_product_of_run.AutoID run_sibling.AutoID AND
first_product_of_run.StartDate product_between.Item AND
first_product_of_run.StartDate
Could you describe your table structure some more? If the "date that each product entered production" is a full time stamp, or if there is a sequential identifier across products, you can write queries to identify the first and last products of a run. From that, you can assign IDs to or calculate the length of the runs.
Edit:
Once you've identified 1,4, and 6 as the start of a run, you can use this query to find the other IDs in the run:
select first_product_of_run.AutoID, run_sibling.AutoID
from first_product_of_run
left join production run_sibling on first_product_of_run.Item = run_sibling.Item
and first_product_of_run.AutoID <> run_sibling.AutoID
and first_product_of_run.StartDate < run_sibling.StartDate
left join production product_between on first_product_of_run.Item <> product_between.Item
and first_product_of_run.StartDate < product_between.StartDate
and product_between.StartDate < run_sibling.StartDate
where product_between.AutoID is null
first_product_of_run can be a temp table, table variable, or sub-query that you used to find the start of a run. The key is the where product_between.AutoID is null. That restricts the results to only pairs where no different items were produced between them.
Edit 2, here's how to get the first of each run:
select first_of_run.AutoID
from
(
select product.AutoID, product.Item, MAX(previous_product.StartDate) as PreviousDate
from production product
left join production previous_product on product.AutoID <> previous_product.AutoID
and product.StartDate > previous_product.StartDate
group by product.AutoID, product.Item
) first_of_run
left join production earlier_in_run
on first_of_run.PreviousDate = earlier_in_run.StartDate
and first_of_run.Item = earlier_in_run.Item
where earlier_in_run.AutoID is null
It's not pretty, and will break if StartDate is not unique. The query could be simplified by adding a sequential and unique identifier with no gaps. In fact, that step will probably be necessary if StartDate is not unique. Here's how it would look:
select first_of_run.AutoID
from production first_of_run
left join production earlier_in_run
on (first_of_run.Sequence - 1) = earlier_in_run.Sequence
and first_of_run.Item = earlier_in_run.Item
where earlier_in_run.AutoID is null
Using outer joins to find where things aren't still twists my brain, but it's a very powerful technique.
I have three tables related to this particular query:
Lawson_Employees: LawsonID (pk), LastName, FirstName, AccCode (numeric)
Lawson_DeptInfo: AccCode (pk), AccCode2 (don't ask, HR set up), DisplayName
tblExpirationDates: EmpID (pk), ACLS (date), EP (date), CPR (date), CPR_Imported (date), PALS (date), Note
The goal is to get the data I need to report on all those who have already expired in one or more certification, or are going to expire in the next 90 days.
Some important notes:
This is being run as part of a vbScript, so the 90-day date is being calculated when the script is run. I'm using 2010-08-31 as a placeholder since its the result at the time this question is being posted.
All cards expire at the end of the month. (which is why the above date is for the end of August and not 90 days on the dot)
A valid EP card supersedes ACLS certification, but only the latter is required of some employees. (wasn't going to worry about it until I got this question answered, but if I can get the help I'll take it)
The CPR column contains the expiration date for the last class they took with us. (NULL if they didn't take any classes with us)
The CPR_Imported column contains the expiration date for the last class they took somewhere else. (NULL if they didn't take it elsewhere, and bravo for following policy)
The distinction between CPR classes is important for other reports. For purposes of this report, all we really care about is which one is the most current - or at least is currently current.
If I have to, I'll ignore ACLS and PALS for the time being as it is non-compliance with CPR training that is the big issue at the moment. (not that the others won't be, but they weren't mentioned in the last meeting...)
Here's the query I have so far, which is giving me good data:
SELECT
iEmp.LawsonID, iEmp.LastName, iEmp.FirstName,
dept.AccCode2, dept.DisplayName,
Exp.ACLS, Exp.EP, Exp.CPR, Exp.CPR_Imported, Exp.PALS, Exp.Note
FROM (Lawson_Employees AS iEmp
LEFT JOIN Lawson_DeptInfo AS dept ON dept.AccCode = iEmp.AccCode)
LEFT JOIN tblExpirationDates AS Exp ON iEmp.LawsonID = Exp.EmpID
WHERE iEmp.CurrentEmp = 1
AND ((Exp.ACLS <= #2010-08-31#
AND Exp.ACLS IS NOT NULL)
OR (Exp.CPR <= #2010-08-31#
AND Exp.CPR_Imported <= #2010-08-31#)
OR (Exp.PALS <= #2010-08-31#
AND Exp.PALS IS NOT NULL))
ORDER BY dept.AccCode2, iEmp.LastName, iEmp.FirstName;
After perusing the result set, I think I'm missing some expiration dates that should be in the result set. Am I missing something? This is the sucky part of being the only developer in the department... no one to ask for a little help.
I think the problem is here:
OR (Exp.CPR <= #2010-08-31#
AND Exp.CPR_Imported <= #2010-08-31#)
Null is not less than or greater than anything.
If you need the people who do not have a valid CPR, you will need to include nulls.
It may be easiest to use Nz:
OR (Nz(Exp.CPR,#2010-08-31#) <= #2010-08-31#
AND Nz(Exp.CPR_Imported,#2010-08-31#) <= #2010-08-31#)