Problem calculating values from one table using GROUP BY in Firebird - sql

I have a logic problem to calculate the final value of this table:
https://i.stack.imgur.com/YPXXX.png
I need to calculate, for every row, +1 when column TIPO has the value "E" and -1 when it has "S", grouping by the columns Codigo and Configuracao.
Basically, I need simple stock control: the columns Codigo and Configuracao identify the product, and TIPO is the type of movement (S = OUT, E = IN).
Can anyone shed some light on this?

untested but maybe this
select SUM(t1.TipoNumeric), t1.CODIGO, t1.CONFIGURACAO from (
select
case (TIPO)
when 'E' then 1
when 'S' then -1
else 0
end as TipoNumeric,
CODIGO,
CONFIGURACAO
from MyTable
) as t1
group by t1.CODIGO, t1.CONFIGURACAO
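To check the idea on a small data set, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for Firebird, and the sample rows are invented because the original data was only posted as an image. The CASE expression is placed directly inside SUM, which should be equivalent to the derived-table form above.

```python
import sqlite3

# Assumption: SQLite stands in for Firebird; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (CODIGO INTEGER, CONFIGURACAO INTEGER, TIPO TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, 10, "E"), (1, 10, "E"), (1, 10, "S"),
        (2, 20, "E"), (2, 20, "S"), (2, 20, "S"),
    ],
)

# 'E' counts as +1, 'S' as -1, anything else as 0, summed per product.
balance_sql = """
SELECT CODIGO, CONFIGURACAO,
       SUM(CASE TIPO WHEN 'E' THEN 1 WHEN 'S' THEN -1 ELSE 0 END) AS saldo
FROM MyTable
GROUP BY CODIGO, CONFIGURACAO
ORDER BY CODIGO, CONFIGURACAO
"""
balances = conn.execute(balance_sql).fetchall()
print(balances)
```

With these rows, product (1, 10) has two entries and one exit, and product (2, 20) has one entry and two exits.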

Just add that +1/-1 column, perhaps?
alter table MyTable
add tipo_val computed by
(
decode( upper(TIPO), 'E', +1, 'S', -1 )
)
https://firebirdsql.org/file/documentation/html/en/refdocs/fblangref25/firebird-25-language-reference.html#fblangref25-ddl-tbl
https://www.firebirdsql.org/refdocs/langrefupd21-intfunc-decode.html
And then:
Select * from MyTable;
Select SUM(tipo_val), CODIGO, CONFIGURACAO
From MyTable
Group by 2, 3
P.S. do not use pictures to show your data.
Instead, put them on http://dbfiddle.uk/?rdbms=firebird_3.0 as a script,
and then use Markdown Export there to copy both the data and a hyperlink into your question text.
P.P.S. I believe your whole approach is wrong if you "need a simple stock control".
https://en.wikipedia.org/wiki/Double-entry_bookkeeping
https://medium.com/@RobertKhou/double-entry-accounting-in-a-relational-database-2b7838a5d7f8
I think your table should have columns like these:
surrogate row id, primary key, auto-incrementing integer, 32-bit or 64-bit
columns identifying your item; usually this is, again, a single surrogate integer SKU (Stock Keeping Unit) referencing (see Foreign Keys) another "dictionary table". In your case it seems to be two columns, Codigo and Configuracao, but that also means you cannot add extra information ("attributes") about your items, like price or photo (read: database normalization). It also makes grouping harder for the Firebird engine than using a single integer column. Also, you did create an index on the item-identifying column(s), did you not? What is your query plan for those selects: do they use an index on Codigo and Configuracao, or an ad hoc external sort instead?
the timestamp of an operation, automatically set by the Firebird server to current_timestamp, so you always know exactly when that row was inserted. Indexed, of course.
the computer user who added that row, again automatically set by the Firebird server to current_user or to the ID of a user in some stock_workers table you would create. Surely indexed too.
some description of an operation, like contract number or seller name, anything that would help you later remember what real-world event that row describes. Being free-form text, it probably would not be indexed. But maybe you would eventually make some contracts or sellers table and add integer references (FK IDs) to those tables? That depends on exactly which kind of data would be repeated often enough to be worth extracting into extra indexed columns.
maybe a unit of measure; maybe all your items will forever be measured only in pieces, in integer quantities. But maybe there would be some items measured in kilograms, meters, liters, etc.?
finally, two integer (or float?) columns like Qty_Income and Qty_Outcome, where you would record how many items were added to or taken from your depot. There would be no E/S column! There would be two integer columns, and you would put a number into one or the other. Why? Read the articles about bookkeeping above!
With such a database schema your query would finally look like this:
select Sum(s.Qty_Income) as Credit, Sum(s.Qty_Outcome) as Debit,
Sum(s.Qty_Income) - Sum(s.Qty_Outcome) as Saldo,
min(g.Codigo), min(g.Configuracao)
from stock_movements s
join known_goods g on g.ID = s.SKU_ID
group by s.SKU_ID
And you would also be able to flexibly compose similar requests grouping by workers, or dates, or quantities (like, only care about BIG events like 1000 or more items added in one operation), or anything.
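A minimal sketch of that two-column layout, assuming SQLite (via Python's sqlite3) in place of Firebird. Only the columns the query needs are included; the timestamp, user and description columns from the list above are left out, and all rows are invented.

```python
import sqlite3

# Assumption: SQLite stands in for Firebird; table contents are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE known_goods (
    ID           INTEGER PRIMARY KEY,
    Codigo       INTEGER NOT NULL,
    Configuracao INTEGER NOT NULL
);
CREATE TABLE stock_movements (
    ID          INTEGER PRIMARY KEY,
    SKU_ID      INTEGER NOT NULL REFERENCES known_goods (ID),
    Qty_Income  INTEGER NOT NULL DEFAULT 0,
    Qty_Outcome INTEGER NOT NULL DEFAULT 0
);
INSERT INTO known_goods (ID, Codigo, Configuracao) VALUES (1, 100, 1), (2, 100, 2);
INSERT INTO stock_movements (SKU_ID, Qty_Income, Qty_Outcome) VALUES
    (1, 50, 0), (1, 0, 20), (2, 5, 0), (2, 0, 5);
""")

# One row per SKU: what came in, what went out, and the remaining balance.
saldo_sql = """
SELECT MIN(g.Codigo), MIN(g.Configuracao),
       SUM(s.Qty_Income)  AS Credit,
       SUM(s.Qty_Outcome) AS Debit,
       SUM(s.Qty_Income) - SUM(s.Qty_Outcome) AS Saldo
FROM stock_movements s
JOIN known_goods g ON g.ID = s.SKU_ID
GROUP BY s.SKU_ID
ORDER BY s.SKU_ID
"""
stock = conn.execute(saldo_sql).fetchall()
print(stock)
```

Because quantities are stored as numbers rather than as an E/S flag, a single movement can add or remove any amount, not just one piece.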

Related

How to produce a reproducible column of random integers in SQL

I have a table of patients, with a unique patientID column. This patientID cannot be shared with study teams, so I need a randomised set of unique patient identifiers to be able to share. The struggle is that there will be several study teams, so every time a randomised identifier is produced, it needs to be different from the identifier produced for other studies. To make it even more complicated, we need to be able to reproduce the same set of random identifiers for a study at any point (if the study needs to re-run the data, for example).
I have looked into the RAND() and NEWID() functions but not managed to figure out a solution. I think this may be possible using RAND() with a seed, and a while loop, but I haven't used these before.
Can anyone provide a solution that allows me to share several randomised sets of unique identifiers, that never have the same identifier for the same patient, and which can be re-run to produce the same list?
Thanks in advance to anyone that helps with this!
Your NEWID() should work as long as you use the correct data type.
Values of the UNIQUEIDENTIFIER data type should be unique across the entire database/server. See full details in the link below:
sqlshack.com/understanding-the-guid-data-type-in-sql-server
DECLARE @UNI UNIQUEIDENTIFIER
SET @UNI = NEWID()
SELECT @UNI
Comments from link:
As mentioned earlier, GUID values are unique across tables, databases, and servers. GUIDs can be considered as global primary keys. Local primary keys are used to uniquely identify records within a table. On the other hand, GUIDs can be used to uniquely identify records across tables, databases, and servers.
One method is to use the patientid as a seed to rand():
select rand(checksum(patientid))
This returns a value between 0 and 1. You can multiply by a large number.
That said, I think you should keep a list of patients in each study -- so you don't have to reproduce the results. Reproducing results seems dangerous, especially for something like a "study" that could have an impact on health.
This is too much for a comment. It's not black and white from your description and comments what you are asking for, but it appears you want to associate a new random ID value for each existing patients' ID, presumably being able to tie it back to the source ID, and produce the same random ID at a later date repeatedly.
It sounds like you'll need an intermediary table to store the randomly produced IDs (otherwise, being random how do you guarantee to get the same value for the same PatientID?)
Could you therefore have a table something like
create table Synonyms (
Id int not null identity(1,1),
PatientId int not null,
RandomId uniqueidentifier not null default newid(),
Createdate datetime not null default getdate()
)
PatientId is the foreign key to the actual Id of the Patient.
Each time you need a new random PatientId, insert the PatientIDs into this table and then join to it when querying out the patient data, supplying the RandomId instead. That way, you can reproduce the same random Id each time it's needed.
You could have a view that always provides the most recent RandomId value for each PatientId, or by some mechanism to track which "version" a report gets.
If you need a new Id for the patient, insert its Id again and you are guaranteed to get the same Id via whatever logic you need - ie you could have a ReportNo column as a sequence partitioned by PatientId or any number of other ways.
If you prefer to avoid a GUID you could make it an int and use a function to generate it by checking it's not already used, possibly a computed column with an inline function that selects top 1 from a numbers table that doesn't already exist as a RandomId... or something like that!
I may have completely misunderstood, hopefully it might give you some ideas though.
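The intermediary-table idea can be sketched as follows. This is only an illustration under assumptions: SQLite (via Python's sqlite3) replaces SQL Server, uuid4 replaces NEWID(), and the StudyId column and the pseudonym helper are hypothetical additions so that each study gets its own identifier per patient.

```python
import sqlite3
import uuid

# Assumption: SQLite stands in for SQL Server; StudyId is a hypothetical extra column.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Synonyms (
    StudyId   INTEGER NOT NULL,
    PatientId INTEGER NOT NULL,
    RandomId  TEXT    NOT NULL UNIQUE,
    PRIMARY KEY (StudyId, PatientId)
)
""")

def pseudonym(study_id, patient_id):
    """Return the stored random ID for this study/patient pair, creating it once."""
    # The insert is skipped when the pair already exists, so the first value sticks.
    conn.execute(
        "INSERT OR IGNORE INTO Synonyms (StudyId, PatientId, RandomId) VALUES (?, ?, ?)",
        (study_id, patient_id, str(uuid.uuid4())),
    )
    row = conn.execute(
        "SELECT RandomId FROM Synonyms WHERE StudyId = ? AND PatientId = ?",
        (study_id, patient_id),
    ).fetchone()
    return row[0]

first = pseudonym(1, 42)   # patient 42 in study 1
again = pseudonym(1, 42)   # re-running the study returns the stored value
other = pseudonym(2, 42)   # the same patient in study 2 gets a different value
```

Reproducibility here comes from storing the value, not from re-deriving it, which matches the suggestion to keep a list of patients per study.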

in sql in a table, in a given column with data type text, how can we show the rest of the entries in that column after a particular entry

In SQL, in any given table, there is a column named "name" with data type text.
If there are ten entries and one entry in the column is "rohit", I want to show all the entries in the name column after rohit. I do not know the row id or id. Can it be done?
select * from your_table where name > 'rohit'
but in general you should not treat text columns like that.
a database is more than a collection of tables.
think about how to organize your data, what defines a datarow.
maybe, beside their name, there is another thing how you would classify such a row? some things like "shall be displayed?" "is modified" "is active"?
so if you had a second column, say display of type int and your table looked like
CREATE TABLE MYDATA (
NAME TEXT,
DISPLAY INT NOT NULL DEFAULT 1
);
you could flag every row with 1 or 0 whether it should be displayed or not and then your query could look like
SELECT * FROM MYDATA WHERE DISPLAY=1 ORDER BY NAME
to get your list of values.
it's not much of a difference with ten rows, you don't even need indexes here, but if you build something bigger, say 10,000+ rows, you'd be surprised how slow that would become!
in general, TEXT columns are good to select and display, but should be avoided as a WHERE condition as much as you can. Use describing columns, preferably int fields, which can be indexed with extremely high efficiency, so that an application doesn't get slower even if the number of records goes over 100k.
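To see what the name > 'rohit' comparison at the start of this answer actually returns, here is a small sketch assuming SQLite (via Python's sqlite3) and invented names. The comparison is alphabetical, so it selects names that sort after "rohit", not rows that were inserted after it.

```python
import sqlite3

# Assumption: SQLite with its default text comparison; the names are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (name TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?)",
    [("zara",), ("amit",), ("rohit",), ("bela",), ("sunil",)],
)

# Names that sort alphabetically after 'rohit', regardless of insertion order.
alphabetical = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM your_table WHERE name > 'rohit' ORDER BY name"
    )
]
print(alphabetical)
```

Here "bela" was inserted after "rohit" but is not returned, while "zara" was inserted before it and is.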
You can use "default" keyword for it.
CREATE TABLE Persons (
ID int NOT NULL,
name varchar(255) DEFAULT 'rohit'
);

I need help counting char occurrences in a row with sql (using firebird server)

I have a table where I have these fields:
id(primary key, auto increment)
car registration number
car model
garage id
and 31 fields, one for each day of the month, in each row.
In these fields I have a char value of 1 or 2 characters representing the car's status on that date. I need a query that counts each possible value for that day; the field for any day can hold D, I, R, TA, RZ, BV or LR.
I need to count, in each row, the number of occurrences of each value in that row.
Like how many I, how many D and so on. And this for every row in the table.
What would be the best approach here? Also, maybe there is a better way than having a field in the table for each day, because that obviously makes over 30 fields.
There is a better way. You should structure the data so you have another table, with rows such as:
CarId
Date
Status
Then your query would simply be:
select status, count(*)
from CarStatuses
where date >= @month_start and date < @month_end
group by status;
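A minimal sketch of the normalized table, assuming SQLite (via Python's sqlite3) instead of Firebird, dates stored as ISO-8601 text, and invented rows. The month boundaries are passed as query parameters.

```python
import sqlite3

# Assumption: SQLite stands in for Firebird; dates are ISO text; rows are made-up.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE CarStatuses (
    CarId  INTEGER NOT NULL,
    Date   TEXT    NOT NULL,
    Status TEXT    NOT NULL,
    PRIMARY KEY (CarId, Date)   -- one status per car per day
)
""")
conn.executemany(
    "INSERT INTO CarStatuses VALUES (?, ?, ?)",
    [
        (1, "2024-03-01", "D"), (1, "2024-03-02", "I"),
        (2, "2024-03-01", "D"), (2, "2024-03-02", "TA"),
        (1, "2024-04-01", "R"),  # outside the queried month
    ],
)

# Count each status within a half-open date range [month_start, month_end).
march_counts = conn.execute(
    """
    SELECT Status, COUNT(*)
    FROM CarStatuses
    WHERE Date >= ? AND Date < ?
    GROUP BY Status
    ORDER BY Status
    """,
    ("2024-03-01", "2024-04-01"),
).fetchall()
print(march_counts)
```

Adding CarId to both the select list and the GROUP BY would give the same counts per car.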
For your data model, this is much harder to deal with. You can do something like this:
select status, count(*)
from ((select status_01 as status
from t
) union all
(select status_02
from t
) union all
. . .
(select status_31
from t
)
) s
group by status;
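The same UNION ALL approach can be tried on a cut-down table. This sketch assumes SQLite (via Python's sqlite3) rather than Firebird, uses only three day columns instead of 31, and the rows are invented; the column names status_01 to status_03 follow the query above.

```python
import sqlite3

# Assumption: SQLite stands in for Firebird; three day columns instead of 31.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, "
    "status_01 TEXT, status_02 TEXT, status_03 TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, "D", "D", "I"), (2, "TA", "D", None)],
)

# Stack the day columns into one 'status' column, then count as usual.
unpivot_sql = """
SELECT status, COUNT(*)
FROM (
    SELECT status_01 AS status FROM t
    UNION ALL
    SELECT status_02 FROM t
    UNION ALL
    SELECT status_03 FROM t
) s
WHERE status IS NOT NULL   -- skip days with no recorded status
GROUP BY status
ORDER BY status
"""
status_counts = conn.execute(unpivot_sql).fetchall()
print(status_counts)
```

Selecting the id in every branch and grouping by it as well would give the per-row counts the question asks for.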
You seem to have to start with most basic tutorials about relational databases and SQL design. Some classic works like "Martin Gruber - Understanding SQL" may help. Or others. ATM you miss the basics.
Few hints.
Documents that you print for the user or receive from the user do not represent your internal data structures. They are created and parsed for that very purpose: a machine-to-human interface. Internally, your program should structure the data for ease of storing and processing.
You have to add a "dictionary table" for the statuses.
ID / abbreviation / human-readable description
You may have a "business rule" that from "R" status you can transition to either "D" status or to "BV" status, but not to any other. In other words you better draft the possible status transitions "directed graph". You would keep it in extra columns of that dictionary table or in one more specialized helper table. Dictionary of transitions for the dictionary of possible statuses.
Your paper form combines both totals and per-day details in the same row. That is easy for a human to read, but for a computer it in a sense violates the single responsibility principle. A row should be responsible either for a primary record or for a derived total. You had better have two tables: one for primary day-by-day records and another for per-month totals.
A bonus is that when you change values in the primary data table, you can have the server automatically recalculate the corresponding month totals. Read about SQL triggers.
Your triggers may also check that the new state is a valid transition from the previous day's state, as described in the "business rules". They might also have to check that there are no gaps between days: if there is a record for "march 03" and a new record for "march 05" is inserted, then a record for "march 04" should exist, or the server would refuse to add such a row. Well, maybe not, that depends on your business processes. The general idea is that the server should reject any data that it can know to be invalid.
Your per-date and per-month tables should have proper UNIQUE CONSTRAINTs prohibiting duplicate rows. It also means the former should have a DATE-type column and the latter should either have month and year INTEGER-type columns or have a DATE-type column whose day part is always "1"; you would want a CHECK CONSTRAINT for that.
If your company has some registry of cars (and it probably does; it does not look like those cars were driven in by random one-time customers passing by), you have to introduce a dictionary table of cars. Integer ID (PK), registration plate, engine factory number, vagon factory number, colour and whatever else.
The per-month totals table would not have many columns per every status. It would instead have a special row for every status! The structure would probably be like that: Month / Year / ID of car in the registry / ID of status in the dictionary / count. All columns would be integer type (some may be SmallInt or BigInt, but that is minor nuancing). All the columns together (without count column) should constitute a UNIQUE CONSTRAINT or even better a "compound" Primary Key. Adding a special dedicated PK column here in the totaling table seems redundant to me.
Consequently, your per-day and per-month tables would not have literal (textual and immediate) data for status and car id. Instead they would have integer IDs referencing proper records in the corresponding cars dictionary and status dictionary tables. That you would code as FOREIGN KEY.
Remember the rule of thumb: it is easy to add/delete a row to any table but quite hard to add/delete a column.
With a column-oriented design like yours, what would happen if next year the boss introduced some more statuses? You would have to redesign the table, change the program in many places, and so on.
With the rows-oriented design you would just have to add one row in the statuses dictionary and maybe few rows to transition rules dictionary, and the rest works without any change.
That way you would not

SQL field as sum of other fields

This is not query related, what I would like to know is if it's possible to have a field in a column being displayed as a sum of other fields. A bit like Excel does.
As an example, I have two tables:
Recipes
nrecepie integer
name varchar(255)
time integer
and the other
Instructions
nintrucion integer
nrecepie integer
time integer
So, basically, as a recipe has n instructions, I would like that
recipes.time = sum(instructions.time)
Can this be done in the create table script? If so, how?
You can use a view:
CREATE VIEW recipes_with_time AS
SELECT nrecepie, name, SUM(Instructions.time) AS total_time
FROM Recipes
JOIN Instructions USING (nrecepie)
GROUP BY nrecepie, name
If you really want to have that data in the real table, you must use a trigger.
This could be done with an INSERT/UPDATE/DELETE trigger. Every time data is changed in table Instructions, the trigger would run and update the time value in Recipes.
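A rough sketch of the trigger approach, assuming SQLite (via Python's sqlite3) since the question does not name a database; trigger syntax differs between vendors. The trigger names and the recipe_time helper are hypothetical, and the sample rows are invented.

```python
import sqlite3

# Assumption: SQLite trigger syntax; other databases need different DDL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Recipes (
    nrecepie INTEGER PRIMARY KEY,
    name     VARCHAR(255),
    time     INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE Instructions (
    nintrucion INTEGER PRIMARY KEY,
    nrecepie   INTEGER NOT NULL REFERENCES Recipes (nrecepie),
    time       INTEGER NOT NULL
);

-- Each trigger recomputes the stored total for the affected recipe(s).
CREATE TRIGGER instructions_after_insert AFTER INSERT ON Instructions
BEGIN
    UPDATE Recipes
    SET time = (SELECT COALESCE(SUM(i.time), 0) FROM Instructions i
                WHERE i.nrecepie = Recipes.nrecepie)
    WHERE nrecepie = NEW.nrecepie;
END;
CREATE TRIGGER instructions_after_update AFTER UPDATE ON Instructions
BEGIN
    UPDATE Recipes
    SET time = (SELECT COALESCE(SUM(i.time), 0) FROM Instructions i
                WHERE i.nrecepie = Recipes.nrecepie)
    WHERE nrecepie IN (OLD.nrecepie, NEW.nrecepie);
END;
CREATE TRIGGER instructions_after_delete AFTER DELETE ON Instructions
BEGIN
    UPDATE Recipes
    SET time = (SELECT COALESCE(SUM(i.time), 0) FROM Instructions i
                WHERE i.nrecepie = Recipes.nrecepie)
    WHERE nrecepie = OLD.nrecepie;
END;
""")

def recipe_time():
    """Read the stored total for recipe 1."""
    return conn.execute("SELECT time FROM Recipes WHERE nrecepie = 1").fetchone()[0]

conn.execute("INSERT INTO Recipes (nrecepie, name) VALUES (1, 'soup')")
conn.execute("INSERT INTO Instructions VALUES (1, 1, 5)")
conn.execute("INSERT INTO Instructions VALUES (2, 1, 10)")
after_insert = recipe_time()
conn.execute("UPDATE Instructions SET time = 20 WHERE nintrucion = 2")
after_update = recipe_time()
conn.execute("DELETE FROM Instructions WHERE nintrucion = 1")
after_delete = recipe_time()
```

Recomputing the sum from scratch in each trigger is slower than adjusting it incrementally, but it is harder to get out of sync.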
You can use a trigger to update the time column every time the instructions table is changed, but a more "normal" (less redundant) way would be to compute the time column via a group by clause on a join between the instructions and recepies [sic] table.
In general, you want to avoid situations like that because you're storing derived information (there are exceptions for performance reasons). Therefore, the best solution is to create a view as suggested by AndreKR. This provides an always-correct total that is as easy to SELECT from the database as if it were in an actual, stored column.
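For comparison, a minimal sketch of the view approach, assuming SQLite (via Python's sqlite3) and invented rows. One deliberate variation is labeled in the code: a LEFT JOIN with COALESCE is used so that recipes without instructions still appear with a total of 0, which an inner join would drop.

```python
import sqlite3

# Assumption: SQLite stands in for the unnamed database; rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Recipes (
    nrecepie INTEGER PRIMARY KEY,
    name     VARCHAR(255)
);
CREATE TABLE Instructions (
    nintrucion INTEGER PRIMARY KEY,
    nrecepie   INTEGER NOT NULL REFERENCES Recipes (nrecepie),
    time       INTEGER NOT NULL
);
-- Variation: LEFT JOIN keeps recipes that have no instructions yet.
CREATE VIEW recipes_with_time AS
SELECT r.nrecepie, r.name, COALESCE(SUM(i.time), 0) AS total_time
FROM Recipes r
LEFT JOIN Instructions i ON i.nrecepie = r.nrecepie
GROUP BY r.nrecepie, r.name;

INSERT INTO Recipes VALUES (1, 'soup'), (2, 'toast');
INSERT INTO Instructions VALUES (1, 1, 5), (2, 1, 10);
""")

view_sql = "SELECT nrecepie, name, total_time FROM recipes_with_time ORDER BY nrecepie"
before = conn.execute(view_sql).fetchall()

# The view reflects new instructions immediately; nothing has to be recalculated.
conn.execute("INSERT INTO Instructions VALUES (3, 2, 3)")
after = conn.execute(view_sql).fetchall()
print(before, after)
```

Since nothing derived is stored, there is no total that could drift away from the instruction rows.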
Depends on the database vendor... In SQL Server, for example, you can create a column that calculates its value based on the values of other columns in the same row. They are called calculated columns, and you do it like this:
Create Table MyTable
(
colA Integer,
colB Integer,
colC Integer,
SumABC As colA + colB + colC
)
In general, just put the column name you want, the word 'as' and the formula or equation to generate the value. This approach uses no additional storage; it calculates the value each time someone executes a select against it, so the table profile remains narrower, and you get better performance. The main downside is that indexing a calculated column comes with restrictions (SQL Server has a flag, PERSISTED, that tells the database to store the value whenever the row is created or updated, in which case it can be indexed).
In your example, however, you are accessing data from multiple rows in another table. To do this, you need a trigger as suggested by other respondents.

How are these tasks done in SQL?

I have a table, and there is no column which stores a field of when the record/row was added. How can I get the latest entry into this table? There would be two cases in this:
Loop through entire table and get the largest ID, if a numeric ID is being used as the identifier. But this would be very inefficient for a large table.
If a random string is being used as the identifier (which is probably very, very bad practise), then this would require more thinking (I personally have no idea other than my first point above).
If I have one field in each row of my table which is numeric, and I want to add it up to get a total (so row 1 has a field which is 3, row 2 has a field which is 7, I want to add all these up and return the total), how would this be done?
Thanks
1) If the id is incremental, "select max(id) as latest from mytable". If a random string was used, there should still be an incremental numeric primary key in addition. Add it. There is no reason not to have one, and databases are optimized to use such a primary key for relations.
2) "select sum(mynumfield) as total from mytable"
for the last thing use a SUM()
SELECT SUM(OrderPrice) AS OrderTotal FROM Orders
assuming they are all in the same column.
Your first question is a bit unclear, but if you want to know when a row was inserted (or updated), then the only way is to record the time when the insert/update occurs. Typically, you use a DEFAULT constraint for inserts and a trigger for updates.
If you want to know the maximum value (which may not necessarily be the last inserted row) then use MAX, as others have said:
SELECT MAX(SomeColumn) FROM dbo.SomeTable
If the column is indexed, MSSQL does not need to read the whole table to answer this query.
For the second question, just do this:
SELECT SUM(SomeColumn) FROM dbo.SomeTable
You might want to look into some SQL books and tutorials to pick up the basic syntax.
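Both queries can be tried on a throwaway table. This sketch assumes SQLite (via Python's sqlite3) instead of MSSQL, an auto-incrementing integer id as discussed in the first answer, and the two values 3 and 7 from the question.

```python
import sqlite3

# Assumption: SQLite stands in for MSSQL; the id is an auto-incrementing integer key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SomeTable ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, SomeColumn INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO SomeTable (SomeColumn) VALUES (?)", [(3,), (7,)])

# Highest id, which is the latest row only if ids are assigned incrementally.
latest_id = conn.execute("SELECT MAX(id) FROM SomeTable").fetchone()[0]

# Total of the numeric field across all rows.
total = conn.execute("SELECT SUM(SomeColumn) FROM SomeTable").fetchone()[0]
print(latest_id, total)
```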