I need help counting char occurrences in a row with SQL (using Firebird server)

I have a table where I have these fields:
id(primary key, auto increment)
car registration number
car model
garage id
and 31 fields, one for each day of the month, in each row.
In these fields I have a code of 1 or 2 characters representing the car's status on that date. Any day's field can hold the values D, I, R, TA, RZ, BV or LR, and I need to make a query that counts each possibility for that day.
I need to count, in each row, the amount of each value in that row - like how many I, how many D, and so on - and this for every row in the table.
What would be the best approach here? Also, maybe there is a better way than having a field in the database table for each day, because that obviously makes over 30 fields.

There is a better way. You should structure the data so you have another table, with columns such as:
CarId
StatusDate
Status
Then your query would simply be:
select status, count(*)
from CarStatuses
where StatusDate >= :month_start and StatusDate < :month_end
group by status;
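For concreteness, a minimal Firebird sketch of that table (the names and the compound key are my assumptions, not something from the question):
create table CarStatuses (
    CarId       integer    not null,   -- references your cars table
    StatusDate  date       not null,
    Status      varchar(2) not null,   -- D, I, R, TA, RZ, BV or LR
    constraint pk_car_statuses primary key (CarId, StatusDate)
);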
With your current data model, this is much harder to deal with. You can do something like this:
select status, count(*)
from ((select status_01 as status
       from t
      ) union all
      (select status_02
       from t
      ) union all
      . . .
      (select status_31
       from t
      )
     ) s
group by status;

You seem to need to start with the most basic tutorials on relational databases and SQL design. Classic works such as Martin Gruber's "Understanding SQL" may help, or others. At the moment you are missing the basics.
A few hints.
Documents that you print for a user or receive from a user do not have to mirror your internal data structures; they are created and parsed purely as a machine-to-human interface. Inside, your program should structure the data for ease of storing and processing.
You have to add a "dictionary table" for the statuses.
ID / abbreviation / human-readable description
You may have a "business rule" that from the "R" status you can transition to either the "D" status or the "BV" status, but not to any other. In other words, you had better draft a "directed graph" of the possible status transitions. You would keep it either in extra columns of that dictionary table or in one more specialized helper table: a dictionary of transitions over the dictionary of possible statuses.
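A rough Firebird sketch of those two dictionary tables (all names here are illustrative assumptions):
create table status_dict (
    id            smallint     not null primary key,
    abbreviation  varchar(2)   not null unique,    -- D, I, R, TA, RZ, BV, LR
    description   varchar(100) not null
);
create table status_transition (
    from_status  smallint not null references status_dict(id),
    to_status    smallint not null references status_dict(id),
    constraint pk_status_transition primary key (from_status, to_status)
);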
Your paper form combines in the same row both the totals and the per-day detail. That is easy for a human to look at, but for a computer it in a sense violates the single-responsibility principle: a row should be responsible either for a primary record or for a derived total. You had better have two tables - one for the primary day-by-day records and another for the per-month totals.
A bonus point is that when you change values in the primary data table, you can ask the server to automatically recalculate the corresponding month totals. Read about SQL triggers.
Your triggers may also check whether the new state is a valid transition from the previous day's state, as described in the "business rules". They might also have to check that there are no gaps between days: if there is a record for "March 03" and a new record is inserted for "March 05", then a record for "March 04" should exist, or the server should prohibit adding such a row. Well, maybe not - that depends on your business processes. The general idea is that the server should reject storing any data that it can know is not valid.
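For illustration only, a minimal Firebird PSQL sketch of a totals-maintaining trigger; it assumes per-day and per-month tables named car_day_status and car_month_total like the ones sketched a bit further below:
set term ^ ;
create trigger car_day_status_ai for car_day_status
after insert
as
begin
  -- try to bump an existing monthly counter first
  update car_month_total
     set cnt = cnt + 1
   where car_id = new.car_id
     and status_id = new.status_id
     and year_no = extract(year from new.status_date)
     and month_no = extract(month from new.status_date);
  -- no counter yet for this car/status/month: create one
  if (row_count = 0) then
    insert into car_month_total (car_id, status_id, year_no, month_no, cnt)
    values (new.car_id, new.status_id,
            extract(year from new.status_date),
            extract(month from new.status_date), 1);
end^
set term ; ^
A complete solution would also need matching UPDATE and DELETE triggers, but the idea is the same.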
Your per-day and per-month tables should have proper UNIQUE CONSTRAINTs prohibiting duplicate rows. It also means the former should have a DATE-type column, and the latter should either have month and year INTEGER-type columns or a DATE-type column whose day part is always "1" - you would want a CHECK CONSTRAINT for that.
If your company has some registry of cars (and it probably does - it does not look as if those cars were driven in by random one-time customers passing by), you have to introduce a dictionary table of cars: integer ID (PK), registration plate, engine factory number, body factory number, colour and whatever else.
The per-month totals table would not have one column per status. It would instead have a separate row for every status! The structure would probably be like this: Month / Year / ID of the car in the registry / ID of the status in the dictionary / count. All columns would be of integer type (some may be SmallInt or BigInt, but that is minor nuancing). All the columns together (except the count column) should constitute a UNIQUE CONSTRAINT, or even better a "compound" Primary Key. Adding a special dedicated PK column to this totals table seems redundant to me.
Consequently, your per-day and per-month tables would not carry literal (textual, immediate) data for the status and car ID. Instead they would hold integer IDs referencing the proper records in the corresponding cars dictionary and status dictionary tables. You would code that as a FOREIGN KEY.
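Putting those hints together, a rough Firebird sketch (again, every name is an assumption; status_dict is the dictionary sketched above):
create table cars_dict (
    id         integer     not null primary key,
    reg_plate  varchar(16) not null unique,
    model      varchar(50)
);
create table car_day_status (
    car_id       integer  not null references cars_dict(id),
    status_date  date     not null,
    status_id    smallint not null references status_dict(id),
    constraint pk_car_day_status primary key (car_id, status_date)
);
create table car_month_total (
    car_id     integer  not null references cars_dict(id),
    status_id  smallint not null references status_dict(id),
    year_no    smallint not null,
    month_no   smallint not null check (month_no between 1 and 12),
    cnt        integer  not null,
    constraint pk_car_month_total primary key (car_id, status_id, year_no, month_no)
);
The per-status counts the original question asks for then become a plain select from car_month_total, or a simple GROUP BY over car_day_status.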
Remember the rule of thumb: it is easy to add or delete a row in any table, but quite hard to add or delete a column.
With a column-oriented design like yours, what would happen if next year the boss introduced some more statuses? You would have to redesign the table, the program in many places, and so on.
With the row-oriented design you would just have to add one row to the statuses dictionary and maybe a few rows to the transition-rules dictionary, and the rest works without any change. That way you would not have to touch the table structure or the program at all.

Related

Problem calculating values from 1 table using GROUP BY in Firebird

I have a logic problem calculating the final value of this table:
https://i.stack.imgur.com/YPXXX.png
I need to count +1 for every row where the column TIPO has the value "E" and -1 where it has "S", grouping by the columns Codigo and Configuracao.
Basically, I need a simple stock control: the columns Codigo and Configuracao identify the product, and TIPO is the type of movement, S = OUT and E = IN.
Can anyone shed some light on this?
untested but maybe this
select SUM(t1.TipoNumeric), t1.CODIGO, t1.CONFIGURACAO
from (
    select
        case TIPO
            when 'E' then 1
            when 'S' then -1
            else 0
        end as TipoNumeric,
        CODIGO,
        CONFIGURACAO
    from MyTable
) as t1
group by t1.CODIGO, t1.CONFIGURACAO
Just add that +1/-1 column, perhaps?
alter table MyTable
add tipo_val computed by
(
decode( upper(TIPO), 'E', +1, 'S', -1 )
)
https://firebirdsql.org/file/documentation/html/en/refdocs/fblangref25/firebird-25-language-reference.html#fblangref25-ddl-tbl
https://www.firebirdsql.org/refdocs/langrefupd21-intfunc-decode.html
And then:
Select * from MyTable;
Select SUM(tipo_val), CODIGO, CONFIGURACAO
From MyTable
Group by 2, 3
P.S. Do not use pictures to show your data. Instead, put it into http://dbfiddle.uk/?rdbms=firebird_3.0 as a script,
and then use the Markdown Export there to copy both the data and a hyperlink into your question text.
P.P.S. I believe your whole approach is wrong there, if you "need a simple stock control".
https://en.wikipedia.org/wiki/Double-entry_bookkeeping
https://medium.com/@RobertKhou/double-entry-accounting-in-a-relational-database-2b7838a5d7f8
I think your table should have columns like these (see the sketch after this list):
a surrogate row ID: primary key, auto-incrementing integer, 32-bit or 64-bit;
columns identifying your item; usually this is, again, a single surrogate integer SKU (Stock Keeping Unit) referencing (see: foreign keys) another "dictionary table". In your case it seems to be the two columns Codigo and Configuracao, but that also means you cannot add extra information ("attributes") about your items, like price or photo (read about database normalization). It also makes grouping harder for the Firebird engine than using a single integer column. Also, you did create an index on the item-identifying column(s), did you not? What is your query plan on those selects - do they use an index on Codigo and Configuracao, or an ad hoc external sort instead?
the timestamp of the operation, automatically set by the Firebird server to current_timestamp, so you always know exactly when that row was inserted. Indexed, of course.
the computer user who added the row; again, automatically set by the Firebird server to current_user, or to an ID of a user in some stock_workers table you would create. Indexed too, of course.
some description of the operation - contract number, seller name, anything that would later help you remember what real-world event that row describes. Being free-form text, it probably would not be indexed. But maybe you would eventually make some contracts or sellers table and add integer references (FK IDs) to those tables? That depends on exactly which kind of data is repeated often enough to be worth extracting into extra indexed columns.
maybe a unit of measure; maybe all your items will forever be measured only in pieces, in integer quantities, but maybe some items would be measured in kilograms, meters, liters, etc.?
finally, two integer (or float?) columns like Qty_Income and Qty_Outcome, where you would record how many items were added to or taken from your depot. There would be no E/S column! There would be two numeric columns, and you would put the number into one or the other. Why? Read the articles about bookkeeping above!
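As a sketch only (every identifier here is illustrative, and known_goods is the assumed items dictionary), such a movements table in Firebird 3 could look like:
create table stock_movements (
    id           bigint generated by default as identity primary key,
    sku_id       integer       not null references known_goods(id),
    moved_at     timestamp     default current_timestamp not null,
    moved_by     varchar(63)   default current_user not null,
    description  varchar(200),
    qty_income   numeric(15,3) default 0 not null,
    qty_outcome  numeric(15,3) default 0 not null,
    constraint chk_one_direction check (qty_income = 0 or qty_outcome = 0)
);
create index ix_movements_sku  on stock_movements (sku_id);
create index ix_movements_when on stock_movements (moved_at);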
In such a database scheme your query would finally look like this:
select Sum(s.Qty_Income) as Credit, Sum(s.Qty_Outcome) as Debit,
Sum(s.Qty_Income) - Sum(s.Qty_Outcome) as Saldo,
min(g.Codigo), min(g.Configuracao)
from stock_movements s
join known_goods g on g.ID = s.SKU_ID
group by s.SKU_ID
And you would also be able to flexibly compose similar queries grouping by workers, or dates, or quantities (say, only caring about big events of 1000 or more items added in one operation), or anything else.

How to add a column for each day in sql?

I'm trying to make an attendance management system for my college project.
I'm planning to create one table for each month.
Each table will have
OCT(Roll_no int ,Name varchar, (dates...) bool)
Here the dates will run from 1 to 30 and store a boolean for present or absent.
Is this a good way to do it?
Is there a way to dynamically add a column for each day when the data is filled in?
Also, how can I populate data according to the current day?
Edit: I'm planning to make a UI which will have only two options (Present, Absent) corresponding to each fetched roll no.
So the roll nos. and names are already going to be in the table; I'll just add the status (present or absent) corresponding to each row in the table for each date.
I would use Firebase. Make a node with a list of users, then inside the users make an attendance node with timestamps for attended days. That way it's easier to parse. You would also leave room to bind data from other tables to users, as well as the ability to add additional properties to each user.
Or do the SQL equivalent, which would be to make a table of users (names and user properties) with associated keys (primary keys in the user table, foreign keys in the attendance table) that contains an attendance column holding an array of timestamps representing attended days.
Either way, your UI would then only have to process timestamps and be able to parse through them by date.
Though maybe add additional columns as the years go by, so it wouldn't be so much of a bulk download.
Edit: In your case you'd want the SQL columns to be by month, letting you select whichever month you'd like. For your UI, on inserting new attendance you'd simply add a column to the table if it does not already exist and then continue with the submission. On search/view you'd handle null results (say there were two months where no one attended at all; you'd catch any exceptions and continue with your display).
Ex:
User
Primary Key - Name
1 - Joe
2 - Don
3 - Rob
Attendance
Foreign Key - Dates Array (Oct 2017)
1 - 1508198400, 1508284800, 1508371200
2 - 1508284800
3 - 1508198400, 1508371200
I'd agree with Gordon. This is not a good way to store the data. (It might be a good way to present it.) If you have a table with the following columns, you will be able to store the data you want (a minimal sketch follows the list):
roll_no (int)
Name (varchar)
Date (Date)
Present (bool)
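For concreteness, a minimal SQL Server sketch of that table (the composite key and the bit type for the boolean are my assumptions):
CREATE TABLE Attendance (
    roll_no  int          NOT NULL,
    [Name]   varchar(100) NOT NULL,
    [Date]   date         NOT NULL,
    Present  bit          NOT NULL,
    CONSTRAINT PK_Attendance PRIMARY KEY (roll_no, [Date])   -- one row per student per day
);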
If you want to then pull out the data for a particular month, you could just add this into your WHERE clause:
WHERE DATEPART(mm, [Date]) = 10 -- for October, or pass in a parameter
Dynamically adding columns is going to be a pain in the neck and is also quite messy

How to create a custom primary key using strings and date

I have an order table in SQL Server and I need the order number primary key to look like this:
OR\20160202\01
OR is just a string
20160202 is the Date
01 is sequence number for that day
for the second order record on the same day it would be
OR\20160202\02 and so on..
Backslashes should also be included...
What's the way to go about creating such a field in SQL Server (using version 2016)?
EDIT: To add more context on what the sequence number is: it's just a way for this field, composite or not, to be unique. Without a sequence number I would get duplicate records in the DB, because I could have many records on the same day; the date would remain the same, so the value would be something like
OR\20160202 for all rows on that particular day, i.e. duplicates. Adding a "sequence" number helps solve this.
The best way is to not create such a column in SQL. You're effectively combining multiple pieces of data into the same column, which shouldn't happen in a relational database for many reasons. A column should hold one piece of data.
Instead, create a composite primary key across all of the necessary columns.
Composite PK:
order varchar(20)
orDate datetime

select *
     , row_number() over (partition by cast(orDate as date) order by orDate) as seq
from [table]
Will leave it to you how to concatenate the data.
That is a presentation thing - don't make it a problem for the PK.
About "sequence number for that day" (department, year, country, ...).
Almost every time I discussed such a requirement with end users, it turned out to be just a misunderstanding of how a shared database works - a vague attempt to repeat old tricks (separate databases, Excel files or even paperwork) on a shared database.
So I second Tom H and the others: first, try not to do it.
If you nevertheless must do it, for legal or other non-negotiable reasons, then I hope you are on 2012+. Create a SEQUENCE for every day.
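A hedged sketch of that idea on SQL Server 2012+ (the sequence-name pattern and the helper variables are mine; a nightly job or the first order of the day would have to create the day's sequence):
DECLARE @day char(8) = CONVERT(char(8), GETDATE(), 112);          -- e.g. '20160202'

-- create today's sequence once
IF NOT EXISTS (SELECT 1 FROM sys.sequences WHERE name = 'OrderSeq_' + @day)
    EXEC ('CREATE SEQUENCE dbo.OrderSeq_' + @day + ' AS int START WITH 1 INCREMENT BY 1');

-- take the next number and format it for display; keep the real stored PK a plain integer
DECLARE @seq int;
DECLARE @sql nvarchar(200) = N'SELECT @n = NEXT VALUE FOR dbo.OrderSeq_' + @day + N';';
EXEC sp_executesql @sql, N'@n int OUTPUT', @n = @seq OUTPUT;
SELECT 'OR\' + @day + '\' + RIGHT('00' + CAST(@seq AS varchar(10)), 2) AS order_number;   -- 2-digit part overflows past 99
Most of the answers here still recommend storing a plain integer or composite key and producing the OR\yyyymmdd\nn string only for display.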
A formatted PK is not a good idea. A composite key is a better approach. The combination of the day as a date column and the order number as a bigint column should be used. This helps improve query performance too.
You might want to explore a 'Date Dimension' table. A date dimension is a commonly used table in data warehousing. It stores all the days of the calendar (based on your choice of years) and numeric generated keys for those days. Check this post on date dimensions; it talks about creating one in SQL Server.
https://www.mssqltips.com/sqlservertip/4054/creating-a-date-dimension-or-calendar-table-in-sql-server/

How to force an ID column to remain sequential even if a record has been deleted, in SQL Server?

I don't know the best wording for this question, but I have a table that has 2 columns: ID and NAME.
When I delete a record from the table, the related ID is deleted with it and the sequence is spoiled.
Take this example:
if I delete row number 2, the sequence of the ID column will be: 1, 3, 4
How do I make it: 1, 2, 3?
ID's are meant to be unique for a reason. Consider this scenario:
**Customers**
id value
1 John
2 Jackie
**Accounts**
id customer_id balance
1 1 $500
2 2 $1000
In the case of a relational database, say you were to delete "John" from the database. Now Jackie would take on the customer_id of 1. When Jackie goes in to check her balance, she will now show $500 short.
Granted, you could go through and update all of her other records, but A) this would be a massive pain in the ass, and B) it would be very easy to make mistakes, especially in a large database.
Ids (primary keys in this case) are meant to be the rock that holds your relational database together, and you should always be able to rely on that value regardless of the table.
As JohnFx pointed out, should you want a value that shows the order of the user, consider using a built in function when querying.
In SQL Server identity columns are not guaranteed to be sequential. You can use the ROW_NUMBER function to generate a sequential list of ids when you query the data from the database:
SELECT
    ROW_NUMBER() OVER (ORDER BY Id) AS SequentialId,
    Id AS UniqueId,
    Name
FROM dbo.Details
If you want sequential numbers don't store them in the database. That is just a maintenance nightmare, and I really can't think of a very good reason you'd even want to bother.
Just generate them dynamically using T-SQL's ROW_NUMBER function when you query the data.
The whole point of an Identity column is creating a reliable identifier that you can count on pointing to that row in the DB. If you shift them around you undermine the main reason you WANT an ID.
In a real world example, how would you feel if the IRS wanted to change your SSN every week so they could keep the Social Security Numbers sequential after people died off?

How to create a history fact table?

I have some entities in my Data Warehouse:
Person - with attributes personId, dateFrom, dateTo, and others that can change, e.g. last name, birth date and so on - a slowly changing dimension
Document - documentId, number, type
Address - addressId, city, street, house, flat
The relation between Person and Document is one-to-many, and between Person and Address it is many-to-many.
My goal is to create a history fact table that can answer the following questions:
1. What persons with what documents lived at a given address on a given date?
2. What history of residents does a given address have over a given interval of time?
This is not the only thing the DW is designed for, but I think it is the hardest part of its design.
For example, Miss Brown with personId=1 and documents with documentId=1 and documentId=2 had lived at the address with addressId=1 from 01/01/2005 to 02/02/2010 and then moved to addressId=2, where she has lived from 02/03/2010 to the current date (NULL?). But she changed her last name to Mrs Green on 04/05/2006, and her first document from documentId=1 to documentId=3 on 06/07/2007. Mr Black with personId=2 and documentId=4 has lived at addressId=1 from 02/03/2010 to the current date.
The expected result of the query for question 2, with addressId=1 and the time interval from 01/01/2000 to now, should look like:
Rows:
last_name="Brown", documentId=1, dateFrom=01/01/2005, dateTo=04/04/2006
last_name="Brown", documentId=2, dateFrom=01/01/2005, dateTo=04/04/2006
last_name="Green", documentId=1, dateFrom=04/05/2006, dateTo=06/06/2007
last_name="Green", documentId=2, dateFrom=04/05/2006, dateTo=06/06/2007
last_name="Green", documentId=2, dateFrom=06/07/2007, dateTo=02/01/2010
last_name="Green", documentId=3, dateFrom=06/07/2007, dateTo=02/01/2010
last_name="Black", documentId=4, dateFrom=02/03/2010, dateTo=NULL
I had an idea to create a fact table with the composite key (personId, documentId, addressId, dateFrom), but I have no idea how to load this table and then get the expected result with this structure.
I would appreciate any help!
Interesting question, @Argnist!
So to create some common language for my example, you want a
DimPerson (PK = kcPerson, surrogate key for unique Persons = kPerson, type 2 dim)
DimDocument (PK = kcDocument, surrogate key for unique Documents = kDocument, type 2 dim)
DimAddress (PK = kcAddress, surrogate key for unique Addresses = kAddress, type 2 dim)
A colleague has written a short blog on the usage of two surrogate keys to explain the above dims 'Using Two Surrogate Keys on Dimensions'.
I would always add
DimDate with PK in the form yyyymmdd
to any data warehouse with extra attribute columns.
Then you would have your fact table as
FactHistory (FKs = kcPerson, kPerson, kcDocument, kDocument, kcAddress, kAddress, kDate)
plus any additional measures.
Then joining on the "kc"s you can show the current Person/Document/Address dimension information.
If you join on the "k"s you can show the historic Person/Document/Address dimension information.
The downside of this is that this fact table needs one row for each person/document/address/date combination. But it really is a very narrow table, since the table just has a number of foreign keys.
The advantage of this is it is very easy to query for the sorts of questions you were asking.
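For illustration, a rough, untested sketch of the question-2 query against this first fact design (it assumes DimDate carries a calendar_date column and DimAddress keeps the natural addressId; collapsing consecutive days into dateFrom/dateTo ranges would be a further gaps-and-islands step):
select dt.calendar_date, p.last_name, d.documentId
from FactHistory f
join DimAddress  a  on a.kAddress  = f.kAddress      -- "k" keys: historic versions
join DimPerson   p  on p.kPerson   = f.kPerson
join DimDocument d  on d.kDocument = f.kDocument
join DimDate     dt on dt.kDate    = f.kDate
where a.addressId = 1
  and dt.calendar_date >= '2000-01-01'                -- interval start (placeholder)
order by dt.calendar_date, p.last_name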
Alternatively, you could have your fact table as
FactHistory (FKs = kcPerson, kPerson, kcDocument, kDocument, kcAddress, kAddress, kDateFrom, kDateTo)
plus any additional measures.
This is obviously much more compact, but the querying becomes more complex. You could also put a view over the Fact table to make it easier to query!
The choice of solution depends on how frequently the data changes. I suspect that it will not be changing that quickly, so the alternate design of the fact table may be better.
Hope that helps.