SQL - adding category to string value (mapping table)

I'm trying to reproduce, in a SQL query, a mapping that I previously maintained in an external Excel file.
I have specific errors stored as strings (e.g. "aborted", "timeout"). A simplified example:
count  last_error
452    user_aborted
889    timeout
212    request_denied
98     blacklisted_by_admin
789    login_bad
340    country_not_available
I would like to map these into categories I have defined, so that the result would be a new column with the error category:
count  last_error             error_category
452    user_aborted           user
889    timeout                tech
212    request_denied         risk
98     blacklisted_by_admin   risk
789    login_bad              user
340    country_not_available  tech
What is the best way of doing this? I have about 40 errors, and six categories.

You can use a CASE expression like this:
case
    when last_error in ('user_aborted', 'login_bad') then 'user'
    when last_error in ('request_denied', 'blacklisted_by_admin') then 'risk'
    when last_error in ('timeout', 'country_not_available') then 'tech'
end as error_category
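With about 40 errors, the CASE expression gets long, so the mapping-table idea from the question title may be easier to maintain. Here is a minimal sketch using Python's built-in sqlite3 module so that it runs anywhere. The table names errors and error_map, the column cnt, the extra row some_new_error and the 'unmapped' fallback are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE errors (cnt INTEGER, last_error TEXT);
    INSERT INTO errors VALUES
        (452, 'user_aborted'), (889, 'timeout'), (212, 'request_denied'),
        (98, 'blacklisted_by_admin'), (789, 'login_bad'),
        (340, 'country_not_available'),
        (5, 'some_new_error');  -- assumed extra row with no mapping yet

    -- Hypothetical mapping table: one row per known error string.
    CREATE TABLE error_map (
        last_error     TEXT PRIMARY KEY,
        error_category TEXT NOT NULL
    );
    INSERT INTO error_map VALUES
        ('user_aborted', 'user'), ('login_bad', 'user'),
        ('request_denied', 'risk'), ('blacklisted_by_admin', 'risk'),
        ('timeout', 'tech'), ('country_not_available', 'tech');
""")

# LEFT JOIN keeps errors that have no mapping; COALESCE labels them.
rows = conn.execute("""
    SELECT e.cnt, e.last_error,
           COALESCE(m.error_category, 'unmapped') AS error_category
    FROM errors AS e
    LEFT JOIN error_map AS m ON m.last_error = e.last_error
    ORDER BY e.cnt DESC
""").fetchall()

categories = {last_error: category for _, last_error, category in rows}
```

Adding a new error then means inserting one row into error_map instead of editing every query that contains the CASE expression.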

Related

Second highest column

I have seen a similar question, "How to get second highest value among multiple columns in SQL", but the solution there won't work for Microsoft Access (ROW_NUMBER() OVER (PARTITION BY ...) isn't valid in Access).
My Access query includes dozens of fields. I would like to create a new field/column that returns the second-highest value of 10 specific columns included in the query. I will call this field "Cover". Something like this:
Product  Bid1  Bid2  Bid3  Bid4  Cover
Watch    104   120   115   108   115
Shoe     65    78    79    76    78
Hat      20    22    19    20    20
I can do a really long SWITCH formula such as the following equivalent Excel formula:
IF( AND(Bid1> Bid2, Bid1 > Bid3, Bid1 > Bid4), Bid1,
AND(Bid2> Bid1, Bid2 > Bid3, Bid2 > Bid4), Bid2,
.....
But there must be a more efficient solution. A MAXIF equivalent would work perfectly if MS-Access Query had such a function.
Any ideas? Thank you in advance.

This would be easier if the data were laid out in a more normalized way. The clue is the numbered field names.
Your data is currently organized as a pivot (known in Access as a crosstab), but it can easily be unpivoted.
In this case, the normalized layout would be:
Product    Bid    Amount
---------  -----  --------
Watch      1      104
Watch      2      120
Watch      3      115
Watch      4      108
Shoe       1      65
Shoe       2      78
Shoe       3      79
Shoe       4      76
Hat        1      20
Hat        2      22
Hat        3      19
Hat        4      20
This way querying becomes simpler.
It looks like you want the maximum of the bids, grouped by Product, so:
select Product, max(amount) as maxAmount
from myTable
group by product
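Note that the question asks for the second-highest bid rather than the maximum. On the normalized layout, one way to get it without ROW_NUMBER is a correlated subquery. The following is a sketch run through Python's sqlite3 module, not tested in Access itself. The names wide_bids and bids are assumptions, and "second highest" is taken to mean the highest amount strictly below the maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The wide (crosstab-style) layout from the question.
    CREATE TABLE wide_bids (Product TEXT, Bid1 INT, Bid2 INT, Bid3 INT, Bid4 INT);
    INSERT INTO wide_bids VALUES
        ('Watch', 104, 120, 115, 108),
        ('Shoe', 65, 78, 79, 76),
        ('Hat', 20, 22, 19, 20);

    -- Unpivot: one UNION ALL branch per numbered column.
    CREATE VIEW bids AS
        SELECT Product, 1 AS Bid, Bid1 AS Amount FROM wide_bids
        UNION ALL SELECT Product, 2, Bid2 FROM wide_bids
        UNION ALL SELECT Product, 3, Bid3 FROM wide_bids
        UNION ALL SELECT Product, 4, Bid4 FROM wide_bids;
""")

# Second highest = the maximum of everything below the per-product maximum.
cover = dict(conn.execute("""
    SELECT b.Product, MAX(b.Amount) AS Cover
    FROM bids AS b
    WHERE b.Amount < (SELECT MAX(x.Amount) FROM bids AS x
                      WHERE x.Product = b.Product)
    GROUP BY b.Product
""").fetchall())
```

Here cover maps Watch to 115, Shoe to 78 and Hat to 20. If two bids tie for the top spot, this variant skips both and returns the next lower amount, which may or may not be what you want.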
Ideally, we shouldn't be repeating text values either, so Product should be an ID number, with the associated product names stored once in a separate table instead of several times in this one, like:
ProdID ProdName
-------- ----------
1 Watch
2 Shoe
3 Hat
... but that's another lesson.
Generally speaking, repetition of anything should be avoided. That's pretty much the purpose of a database, but the links below will explain it better than I can. :)
Quackit : Microsoft Access Tutorial
YouTube : DB Planning
Microsoft : Database Design Basics
Microsoft : Database Normalization Basics
Wikipedia : Database Normalization

SQL update for multiple database entries

We are using a commercial application that is customizable. The front end is a web server with MS SQL Server in the background.
We have asset management, in which we can link contracts to assets.
Now I have to create a new workflow: an asset has a cost center, and every night all contracts linked to that asset should automatically take over its cost center.
For example, this is my view "View_Info", which provides the needed information:
IDAsset  IDContract  ConstCenterAsset
111      222         333
111      223         333
112      224         334
113      225         335
....
And my main table "Contract":
ID   CostCenter
222  000
223  000
224  000
225  000
I know how to update one entry in the "Contract" table with the SQL UPDATE command.
But how can I do it for all existing entries?
I have to update about 1000 DB entries every night.

You can UPDATE with JOIN like this:
UPDATE c
SET c.CostCenter = v.ConstCenterAsset
FROM Contract as c
INNER JOIN View_Info as v ON v.IDContract = c.ID;
This way, all entries of the Contract table that have a match in the view View_Info will be updated. You can also add a WHERE clause at the end to limit which entries are updated.
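For a nightly job it can be worth touching only the rows whose value would actually change. Below is a sketch of that idea, run through Python's sqlite3 module so it is self-contained. It uses a correlated subquery instead of the T-SQL UPDATE ... FROM form, assumes the view returns at most one row per contract, and uses a plain table as a stand-in for View_Info. The rows for contracts 224 (already up to date) and 226 (not linked) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contract (ID INTEGER PRIMARY KEY, CostCenter TEXT);
    -- 224 already has the right value and 226 has no asset (assumed rows).
    INSERT INTO Contract VALUES
        (222, '000'), (223, '000'), (224, '334'), (225, '000'), (226, '000');

    -- Plain table standing in for the View_Info view from the question.
    CREATE TABLE View_Info (IDAsset INT, IDContract INT, ConstCenterAsset TEXT);
    INSERT INTO View_Info VALUES
        (111, 222, '333'), (111, 223, '333'), (112, 224, '334'), (113, 225, '335');
""")

# Update only contracts that are linked AND whose cost center differs.
cur = conn.execute("""
    UPDATE Contract
    SET CostCenter = (SELECT v.ConstCenterAsset
                      FROM View_Info AS v
                      WHERE v.IDContract = Contract.ID)
    WHERE EXISTS (SELECT 1
                  FROM View_Info AS v
                  WHERE v.IDContract = Contract.ID
                    AND v.ConstCenterAsset <> Contract.CostCenter)
""")
conn.commit()

changed = cur.rowcount  # number of rows the UPDATE modified
result = dict(conn.execute("SELECT ID, CostCenter FROM Contract"))
```

In the T-SQL statement above, the equivalent filter would be a trailing WHERE c.CostCenter <> v.ConstCenterAsset. Keep in mind that a <> comparison never matches NULL, so contracts with an empty cost center would need an extra IS NULL check.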

Google Cloud Datalab error querying BigQuery tables

I think I am missing something basic here, but I can't figure out what it is.
I am querying a date-partitioned BigQuery table from Google Cloud Datalab. Most of my other queries fetch data as expected, but for this particular table a plain select does not work, although a count(1) query does.
%%sql
select * from Mydataset.sample_sales_yearly_part limit 10
I get the error below:
KeyError Traceback (most recent call last)
/usr/local/lib/python2.7/dist-packages/IPython/core/formatters.pyc in __call__(self, obj)
305 pass
306 else:
--> 307 return printer(obj)
308 # Finally look for special method names
309 method = get_real_method(obj, self.print_method)
/usr/local/lib/python2.7/dist-packages/datalab/bigquery/commands/_bigquery.pyc in _repr_html_query_results_table(results)
999
1000 def _repr_html_query_results_table(results):
-> 1001 return _table_viewer(results)
1002
1003
/usr/local/lib/python2.7/dist-packages/datalab/bigquery/commands/_bigquery.pyc in _table_viewer(table, rows_per_page, fields)
969 meta_time = ''
970
--> 971 data, total_count = datalab.utils.commands.get_data(table, fields, first_row=0, count=rows_per_page)
972
973 if total_count < 0:
/usr/local/lib/python2.7/dist-packages/datalab/utils/commands/_utils.pyc in get_data(source, fields, env, first_row, count, schema)
226 return _get_data_from_table(source.results(), fields, first_row, count, schema)
227 elif isinstance(source, datalab.bigquery.Table):
--> 228 return _get_data_from_table(source, fields, first_row, count, schema)
229 else:
230 raise Exception("Cannot chart %s; unsupported object type" % source)
/usr/local/lib/python2.7/dist-packages/datalab/utils/commands/_utils.pyc in _get_data_from_table(source, fields, first_row, count, schema)
174 gen = source.range(first_row, count) if count >= 0 else source
175 rows = [{'c': [{'v': row[c]} if c in row else {} for c in fields]} for row in gen]
--> 176 return {'cols': _get_cols(fields, schema), 'rows': rows}, source.length
177
178
/usr/local/lib/python2.7/dist-packages/datalab/utils/commands/_utils.pyc in _get_cols(fields, schema)
108 if schema:
109 f = schema[col]
--> 110 cols.append({'id': f.name, 'label': f.name, 'type': typemap[f.data_type]})
111 else:
112 # This will only happen if we had no rows to infer a schema from, so the type
KeyError: u'DATE'
QueryResultsTable job_Ckq91E5HuI8GAMPteXKeHYWMwMo

You may be hitting an issue that was just fixed in https://github.com/googledatalab/pydatalab/pull/68 (but not yet included in a Datalab release).
The background is that the new "Standard SQL" support in BigQuery added new datatypes that can show up in the results schema, and Datalab was not yet updated to handle those.
The next release of Datalab should fix this, but in the meantime you can work around it by wrapping your date fields in an explicit cast to TIMESTAMP as part of your query.
For example, if you see that error with the following code cell:
%%sql
SELECT COUNT(*) as count, d FROM <mytable>
(where 'd' is a field of type 'DATE'), then you can work around the issue by casting that field to a TIMESTAMP like this:
%%sql
SELECT COUNT(*) as count, TIMESTAMP(d) FROM <mytable>
For your particular query, you'll have to change '*' to an explicit list of fields, so that you can cast the DATE field to a timestamp.
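If the table has many columns, writing the field list by hand gets tedious. As a rough sketch, a small helper can build the query text for you. The function build_select is hypothetical and not part of Datalab, and the column names and types below are invented, since the real schema of the table is not shown in the question.

```python
def build_select(table, schema, limit=10):
    """Build a SELECT that wraps DATE columns in TIMESTAMP(...).

    `schema` is a list of (column_name, column_type) pairs that you supply
    yourself. This helper is hypothetical and not part of Datalab.
    """
    parts = []
    for name, col_type in schema:
        if col_type.upper() == "DATE":
            # Cast DATE columns so the result schema has no DATE type.
            parts.append("TIMESTAMP({0}) AS {0}".format(name))
        else:
            parts.append(name)
    return "SELECT {} FROM {} LIMIT {}".format(", ".join(parts), table, limit)


# Assumed columns; replace them with the real schema of your table.
query = build_select(
    "Mydataset.sample_sales_yearly_part",
    [("order_id", "INTEGER"), ("sale_date", "DATE"), ("amount", "FLOAT")],
)
```

The resulting string can then be pasted into a %%sql cell in place of the select * query.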

Maximum Value from multiple tables

I am a high school math teacher and my school's "data specialist." I am self-taught in Microsoft Excel and Access, and I have recently been learning some of the SQL query language behind my usual Access work. I am comfortable using Access queries to tie together data from many sources, such as exam scores from one source, English proficiency from a second source, and home phone numbers from a third source.
Here is a situation that I do not know how to handle in Microsoft Access.
My math students take the New York state examination up to 3 times a year. They need a score of 80 to be considered "college ready."
Here are 3 sample tables. Each table uses the unique primary key "StudentID." The Integrated Algebra exam has the code MXRE.
Table #1 name: JanuaryAlgebra
StudentID  Course  Mark
201        MXRE    90
202        MXRE    55
203        MXRE    67
204        MXRE    80
205        MXRE    78
Note: Student #201 and #204 have finished the exam and do not take it again.
Table #2 name: JuneAlgebra
StudentID  Course  Mark
202        MXRE    70
203        MXRE    76
205        MXRE    81
206        MXRE    86
207        MXRE    78
There are two new students at the school, #206 and #207. Students #205 and #206 have finished the exam with high scores, and the remaining three students take the exam again in August.
Table #3 name: AugustAlgebra
StudentID  Course  Mark
202        MXRE    72
203        MXRE    83
207        MXRE    93
How do I write a query that returns one line for each StudentID, displaying their highest exam score at the end of the school year?
Thanks!
Jeff

I'm not very familiar with Access, but I believe its SQL dialect supports UNION queries and subqueries in the FROM clause. If it does, then you can select all the rows in one statement and get the max. Though I realized when writing this answer that it's probably easier with a sub-select.
In SQL it would look something like:
SELECT StudentId, Course, Max(Mark) AS MaxMark
FROM (
    SELECT StudentId, Course, Mark FROM JanuaryAlgebra
    UNION
    SELECT StudentId, Course, Mark FROM JuneAlgebra
    UNION
    SELECT StudentId, Course, Mark FROM AugustAlgebra
) AS NewTable
GROUP BY StudentId, Course

I would suggest altering the table structure:
YourTable (Student_ID,Course,Mark,Date)
Then you can simply query:
SELECT Student_ID,Course,MAX(Mark) AS Max_Mark
FROM YourTable
--WHERE Course = 'MXRE' --If you wanted only algebra results.
GROUP BY Student_ID,Course
Multiple tables of identical structure almost never make sense.
You can, however, keep your current format and do this by unioning all your tables together in a subquery.
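To make the single-table suggestion concrete, here is a sketch that loads the three sample tables into one table and runs the MAX query, using Python's sqlite3 module rather than Access. The table name ExamResults and the column ExamMonth (used instead of Date, which is a reserved word in many dialects) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ExamResults (Student_ID INT, Course TEXT, Mark INT, ExamMonth TEXT)"
)

# The sample data from the question, keyed by exam sitting.
results = {
    "January": [(201, 90), (202, 55), (203, 67), (204, 80), (205, 78)],
    "June": [(202, 70), (203, 76), (205, 81), (206, 86), (207, 78)],
    "August": [(202, 72), (203, 83), (207, 93)],
}
for month, marks in results.items():
    conn.executemany(
        "INSERT INTO ExamResults VALUES (?, 'MXRE', ?, ?)",
        [(student_id, mark, month) for student_id, mark in marks],
    )

# One row per student with their best mark across all sittings.
best = dict(conn.execute("""
    SELECT Student_ID, MAX(Mark) AS Max_Mark
    FROM ExamResults
    WHERE Course = 'MXRE'
    GROUP BY Student_ID
"""))

# Students at or above the "college ready" threshold of 80.
college_ready = sorted(s for s, mark in best.items() if mark >= 80)
```

With this layout, a new exam sitting is just more rows, and the query does not need to change.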

Database design for a step by step wizard

I am designing a system containing logical steps, each with some associated actions. (The actions are not part of this question, but they are crucial for each step in the list.)
The thing is that I need a way to define all the logical steps in an ordered way, so that I can get the list with a query and also make modifications later on.
Does anyone have experience with this kind of database design?
I have been thinking of having a table named wizard_steps (or something similar) and using a priority column to define the order, but I feel that this design will fail at some point (items with the same priority, new items forcing the rest of the items to be rearranged, and so forth).
Another design I have been thinking about is a "next item" column in the wizard_steps table, but I don't feel this is the right approach either.
So, to summarize: I am trying to make a list of elements where the order is crucial, and the design should be open enough to support multiple lists.
Any ideas on what the database should look like?
Thanks!
EDIT: I found this yii component I will check out: http://www.yiiframework.com/extension/simpleworkflow/
Might be a good solution!

If I understand you correctly, your main concern is to create a schema that supports ordered lists and allows easy insertion and reordering of items.
The following table design:
id_list  item_priority  foreign_itemdef_id
1        1              245
1        2              32
1        3              45
2        1              156
2        2              248
2        3              127
coupled to a table with the item definitions, will be easy to query but difficult to maintain, especially for insertions.
The alternative:
id_list  first_item_id
1        45
2        38
coupled to the linked list:
item_id  next_item  foreign_itemdef_id
45       381        56
381      NULL       59
38       39         89
39       42         78
42       NULL       45
will be difficult both to query and to update (you should update the linked list inside a transaction, otherwise it can get corrupted).
I would prefer the first solution for simplicity.
Depending on your update frequency, you may consider using large increments between item_priority to help insertion:
id_list  item_priority  foreign_itemdef_id
1        1000           245
1        2000           32
1        3000           45
2        1000           156
2        2000           248
2        3000           127
1        2500           46   -- late insertion
1        2750           47   -- late insertion
EDIT:
Here's a query that should make room for an insertion: it increments the priority of all rows in the list at or above the target position. The integer casts keep the concatenated values from being abused for SQL injection.
$query_make_room_for_new_item = "UPDATE item_priority_table SET item_priority = item_priority + 1 WHERE item_priority >= " . (int) $new_item_position_priority . " AND id_list = " . (int) $id_list;
Then insert your item with priority $new_item_position_priority.
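Combining the two ideas (large increments first, shifting rows only when a gap is used up) could look roughly like the sketch below. It uses Python's sqlite3 module instead of PHP so that it is runnable as-is. The helper insert_after and its step parameter are hypothetical, and the table is the item_priority_table from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_priority_table (
        id_list INT, item_priority INT, foreign_itemdef_id INT
    );
    INSERT INTO item_priority_table VALUES
        (1, 1000, 245), (1, 2000, 32), (1, 3000, 45);
""")


def insert_after(conn, id_list, after_priority, itemdef_id, step=1000):
    """Insert an item right after `after_priority`, using the gap midpoint."""
    with conn:  # one transaction: commit on success, roll back on error
        next_priority = conn.execute(
            "SELECT MIN(item_priority) FROM item_priority_table "
            "WHERE id_list = ? AND item_priority > ?",
            (id_list, after_priority),
        ).fetchone()[0]
        if next_priority is None:
            # Nothing follows: append at the end of the list.
            new_priority = after_priority + step
        elif next_priority - after_priority > 1:
            # There is still room: take the midpoint of the gap.
            new_priority = (after_priority + next_priority) // 2
        else:
            # Gap exhausted: shift all later rows up by `step` first.
            conn.execute(
                "UPDATE item_priority_table SET item_priority = item_priority + ? "
                "WHERE id_list = ? AND item_priority > ?",
                (step, id_list, after_priority),
            )
            new_priority = after_priority + max(1, step // 2)
        conn.execute(
            "INSERT INTO item_priority_table VALUES (?, ?, ?)",
            (id_list, new_priority, itemdef_id),
        )
    return new_priority


insert_after(conn, 1, 2000, 46)  # lands at 2500
insert_after(conn, 1, 2500, 47)  # lands at 2750
order = [row[0] for row in conn.execute(
    "SELECT foreign_itemdef_id FROM item_priority_table "
    "WHERE id_list = 1 ORDER BY item_priority")]
```

This reproduces the two late insertions from the table above. Most inserts touch a single row, and the more expensive shift only happens when two neighbouring priorities are adjacent.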
