Access SQL database - ORDER BY

I'm using an MRP system for stocking inventory where I work. The interface itself isn't the best, so I have decided to open up the database file and do everything manually. I'm having some issues though. I'm trying to sort my database by using ORDER BY, but I'm not getting the results I expected. It is showing them in this format:
1
10
100
101
101
11
110
111
etc
Instead of
1
2
3
4
5
etc
This is my query
SELECT *
FROM tblStockItems
Order By (`MasterPNo`)
I'm currently working in Access, and the database is in the JET format. If you're wondering why I am using Access instead of the MRP interface, it is because later down the line I will be needing to re-organise the whole stock system, so a lot of fields will have their product numbers changed.
Thanks for reading

If possible, change the column type to Number.
If not, a cast should do it:
ORDER BY Val(MasterPNo)
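Applied to the query above, that looks like this (Val is the JET/VBA function that converts the leading numeric portion of a string to a number):
SELECT *
FROM tblStockItems
ORDER BY Val(MasterPNo);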


"Operation must use an updateable query" error in Access

I have a master table tblBudget which contains entries like
ProjID  Type  Budget  Active
101     ROM   100     No
101     PLE   110     No
101     DLE   120     Yes
102     ROM   200     No
102     PLE   210     Yes
Every month I get an Excel file which I import and store into a temp table tblMonthlyBudget that contains entries like
ProjID  Type  Budget  Active
101     EAC   100     Yes
102     DLE   110     Yes
I wrote an update query that tries to set all the Active entries in tblBudget to No, so that the new records can then be inserted as the active ones. My query is
UPDATE tblBudget
INNER JOIN tblMonthlyBudget
ON tblBudget.ProjID = tblMonthlyBudget.ProjID
SET tblBudget.Active = false
However I get the error
operation must use an updateable query
even though the query displays correctly in datasheet and design view; I get the error only when executing the query. I tried searching for the error and have tried all sorts of combinations without success. Any alternative approach is welcome.
I suspect this is because the temporary table contains more than one record.
I have tried to replicate your problem using the data and table structures you provided, but the UPDATE works as I would expect. It still works if there are repeated entries in tblMonthlyBudget - it just updates the same rows more times than necessary. Please read the guidance on creating a minimal reproducible example (mcve) and edit your question with data and a table structure that actually produce your error. Otherwise we have almost zero hope of helping you.
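For what it's worth, if the join itself turns out to be the culprit (Access marks some joined queries read-only, for example when the joined table has no primary key or unique index), a commonly suggested workaround is to rewrite the UPDATE with a subquery instead of a join - a sketch:
UPDATE tblBudget
SET Active = False
WHERE ProjID IN (SELECT ProjID FROM tblMonthlyBudget);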

How to sort string data that represents numbers

My client has a set of numeric data stored in a string field in a database. So of course it doesn't sort correctly. These rows sort like this:
105
3
44
When they should sort like this:
3
44
105
This is very much a legacy database and I can't change it at all. I also can't change the software that uses the database. The client doesn't own it or have the source code. It has never worked the way they want. However, there is an unused string field that I could use to sort on (only a small number of fields can be sorted on.)
What I would like to do is take the input data, derive a string from it, and store the new string in the unused field, such that when the data is sorted on this new data, the original data sorts correctly, i.e., numerically.
So, for an overly simplistic example, if the algorithm produced the following new data:
105 -> c
3 -> a
44 -> b
Then when the second column was sorted, the first column would look 'correct'.
The tricky bit is that when new rows are added to the database, they must also sort correctly, without having to regenerate the sort data for all rows. This is the part of the problem that has my brain in a twist. I'm not sure it's actually possible.
You can assume that the number will never be more than 5 'digits'.
I realize this is a total kludge, but since I can't change the system, I have to find a work around, rather than a quality solution. Welcome to the real world.
~~~~~~~~~~~~~~~~~~~~~~ S O L U T I O N ~~~~~~~~~~~~~~~~~~
I don't think this is an uncommon problem, so here are the results of Gordon's solution:
mysql> select * from t order by new;
+------+------------+
| orig | new        |
+------+------------+
| 3    | 0000000003 |
| 44   | 0000000044 |
| 105  | 0000000105 |
+------+------------+
In most databases, you can just do:
order by cast(col as int)
This will convert the string representation to a number and use that for ordering. There is no need for an additional column. If you add one, I would recommend adding a numeric column to contain the actual value.
If you really want to store something in the unused field, then you can left pad the number. How to do this depends on the database, but here is one typical method:
update t
set unused = right(concat('0000000000', col), 10);
Not all databases support these two specific functions, but all offer this basic functionality in some form.
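Because the padded key is a pure function of the row's own value, rows added later sort correctly without regenerating anything - just pad at insert time as well (a MySQL-flavoured sketch; LPAD is equivalent to the RIGHT/CONCAT trick above):
INSERT INTO t (col, unused)
VALUES ('7', LPAD('7', 10, '0'));  -- stored as '0000000007'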
Try something like
SELECT column1 FROM table1 ORDER BY LENGTH(column1) ASC, column1 ASC
(Adjust the column and table name for your environment.)
This is a bit of a hack, but it works as long as the "numbers" in your string column are natural, non-negative integers only (no signs, decimals, or leading zeros).
If you are looking for a more sophisticated approach or algorithm, try searching for natural sort together with your DBMS.

Is there a way to apply a moving limit in SQL?

I have a large database I use for plotting and data examination. For simplicity, say it looks something like this:
| id | day | obs |
+----+-----+-----+
|  1 | 500 | 4.5 |
|  2 | 500 | 4.4 |
|  3 | 500 | 4.7 |
|  4 | 500 | 4.8 |
|  5 | 600 | 5.1 |
|  6 | 600 | 5.2 |
...
This could be stock market data, where we have many points per day that are measured.
What I want to do is look at much longer trends, where the multiple points per day are unnecessary detail that clogs my plotting application. (I want to look at 30000 days, each with about 100 observations.)
Is there a way to do something like SELECT ... LIMIT 1 PER "day"
I suppose I could perform a few SELECT DISTINCT queries to find the correct IDs, but I'd rather do something simple if it is built in.
It doesn't matter if it's the first, last, or an average value per day. Just a single value. I just prefer whatever is fastest.
Also, I'd like to do this for Postgres, MySQL, and SQLite. My application is built to use all three, and I frequently switch between them.
Thanks!
Background: This is for a Ruby on Rails plotting application, so a trick with ActiveRecord will work too. https://github.com/ZachDischner/Rails-Plotter
You need to tag your question with the brand of RDBMS you're using. Rails developers frequently use MySQL, but the answer to your question depends on which one you have.
For all brands except for MySQL, the correct and standard solution is to use windowing functions:
SELECT * FROM (
  SELECT ROW_NUMBER() OVER (PARTITION BY day ORDER BY id) AS RN, *
  FROM stockmarketdata
) AS t
WHERE t.RN = 1;
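Against the sample data above, this returns one row per day (ids 1 and 5). If you're on Postgres you also have a non-standard shortcut for exactly this pattern (a sketch; DISTINCT ON is Postgres-only, so it won't cover the MySQL or SQLite cases):
SELECT DISTINCT ON (day) *
FROM stockmarketdata
ORDER BY day, id;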
For MySQL, which doesn't support windowing functions yet, you can simulate them in a kind of clumsy way with session variables:
SELECT * FROM (SELECT @day:=0, @r:=0) AS _init,
(
  SELECT IF(day=@day, @r:=@r+1, @r:=1) AS RN, @day:=day AS d, s.*
  FROM stockmarketdata AS s
  ORDER BY day
) AS t
WHERE t.RN = 1;
You left a lot of room for options with your statement:
It doesn't matter if it's the first, last, or an average value per day. Just a single value. I just prefer whatever is fastest.
So, I'm going to leave the id out of it and first propose the average of obs for each group, as the simplest and probably the most practical option, though running aggregate functions may not beat a simple limit for speed:
MyModel.group(:day).average(:obs)
If you wanted the minimum:
MyModel.group(:day).minimum(:obs)
If you wanted the maximum:
MyModel.group(:day).maximum(:obs)
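For reference, these group calls compile down to plain aggregate SQL, roughly like this (my_models is the table name Rails conventions would assume for MyModel):
SELECT day, AVG(obs) AS average_obs
FROM my_models
GROUP BY day;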
(Note: The following 2 examples are less efficient than just entering the SQL, but might be more portable.)
But you might want all three:
ActiveRecord::Base.connection.execute(MyModel.select('MIN(obs), AVG(obs), MAX(obs)').group(:day).to_sql).to_a
Or just the data without hashes:
ActiveRecord::Base.connection.exec_query(MyModel.select('MIN(obs), AVG(obs), MAX(obs)').group(:day).to_sql)
If you want median, see this question which is more DB specific, and there are other related posts about it if you search.
And for more, some DB's like postgres have variance(...), stddev(...), etc. built-in.
Finally, check out the query section in the Rails guide and ARel for more info on constructing queries. You can do a limit in an ActiveRecord relation via first or limit for example, and in ARel, take lets you do a limit. Subqueries are possible too, as shown in answers to this question, and so is group by, etc. If you are sharing this project with others, try to limit the amount of non-portable SQL you are using unless you plan on adding support for other databases on your own and maintaining that.

ORM Select n + 1 performance; join or no join

There are similar questions to this, but I don't think anyone has asked this particular question.
Scenario:
Customer - Order (where Order has a CustomerID) - OrderPart - Part
I want a query that returns a customer with all its orders and each order with its parts.
Now I have two main choices:
Use a nested loop (which produces separate queries)
Use data loading options (which produces a single query join)
The question:
Most advice and examples on ORMs suggest using option 2, and I can see why. However, option 2 will potentially send back a huge amount of duplicated data, e.g.:
Option 1 results (3 queries):
ID  Name       Country
1   Customer1  UK

ID  Name
1   Order1
2   Order2

ID  Name
1   Part1
2   Part2
3   Part3
Option 2 results (1 query):
ID  Name       Country  ID  Name    ID  Name
1   Customer1  UK       1   Order1  1   Part1
1   Customer1  UK       1   Order1  2   Part2
1   Customer1  UK       1   Order1  3   Part3
1   Customer1  UK       2   Order2  1   Part1
1   Customer1  UK       2   Order2  2   Part2
Option 1 sends back 13 fields with 3 queries. Option 2 sends back 42 fields in 1 query. Now imagine Customer table has 30 fields and Orders have more complex sub joins, the data duplication can quickly become huge.
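For concreteness, a sketch of the single joined query behind Option 2 (the key columns on OrderPart are assumptions based on the scenario above):
SELECT c.ID, c.Name, c.Country,
       o.ID, o.Name,
       p.ID, p.Name
FROM Customer c
JOIN "Order" o ON o.CustomerID = c.ID
JOIN OrderPart op ON op.OrderID = o.ID
JOIN Part p ON p.ID = op.PartID;
Every Part row is repeated once per order that uses it, which is exactly where the duplication in the result set above comes from.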
What impact on overall performance do the following things have:
Overhead of making a database connection
Time taken to send data (potentially across network if on different server)
Bandwidth
Is option 2 always the best choice, option 1 the best choice or does it depend on the situation? If it depends, what criteria should you use to determine? Are any ORMs clever enough to work it out for themselves?
Overhead of making a database connection
Very little if they are on the same subnet, which they usually are. If they're not then this is still not a huge overhead and can be overcome with caching, which most ORMs have (NHibernate has 1st and 2nd level caching).
Time taken to send data (potentially across network if on different server)
For SELECT N+1 this will obviously be longer, as it will have to send the select statement each time, which might be up to 1k long. It will also have to grab a new connection from the pool. Chatty versus chunky used to be an argument around 2002-2003, but now it really doesn't make a huge difference unless this is a really big application, in which case you will probably want a more experienced (or better paid) pundit giving their views - i.e. a consultant.
I would favour joins, however, as databases have been optimised for this usage over their 10 or more years of development. If performance is really slow, a view or stored procedure can sort it out.
By the way, SELECT N+1 is probably the commonest performance problem people experience with NHibernate when they first start using it (including me), and is something that actually takes tweaking to sort out. This is because NHibernate is to ORMs what C++ is to languages.
Bandwidth
An extra SELECT statement for every Customer will eventually build up to however many Customer objects * Orders there are. So for a large system this might be noticeable - but as I mentioned, ORMs usually have caching mechanisms in place to negate this problem. The number of SELECT statements also isn't going to be that huge, considering:
You're on the same network as the SQL server most of the time
The increased amount of data accounts for maybe an extra 0.5-50k of bandwidth - think how fast that transfers on most servers.
A great deal of this is going to depend on the amount of data you are going through.
The join, while returning more fields, is going to run markedly faster (as a rule) than the Option 1 set of queries.
From my personal experience, slow-downs are almost always at that level, the actual running of the query, not the sheer amount of data being passed along whatever pipe you have.

Best way to use hibernate for complex queries like top N per group

I've been working for a while now on a reporting application where I use Hibernate to define my queries. However, more and more I get the feeling that for reporting use cases this is not the best approach.
The queries only return partial columns, and thus not typed objects (unless you cast all fields in Java).
It is hard to express queries without going straight into SQL or HQL.
My current problem is that I want to get the top N per group, for example the last 5 days per element in a group, where for each day I display the number of visitors.
The result should look like:
| RowName  | 1-1-2009 | 2-1-2009 | 3-1-2009 | 4-1-2009 | 5-1-2009 |
| SomeName |        1 |       42 |       34 |       32 |       35 |
What is the best approach to transform the data, which is stored per day per row, into an output like this? Is it time to fall back on regular SQL and work with untyped data?
I really want to use typed objects for my results, but Java makes my life pretty hard there. Any suggestions are welcome!
Using the Criteria API, you can do this:
Session session = ...;
Criteria criteria = session.createCriteria(MyClass.class);
criteria.setFirstResult(0); // results are zero-indexed, so start at the first row
criteria.setMaxResults(5);  // and return at most five rows
// ... any other criteria ...
List topFive = criteria.list();
To do this in vanilla SQL (and to confirm that Hibernate is doing what you expect), check out this SO post.
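Meanwhile, a common vanilla-SQL formulation of top N per group uses a correlated count (a sketch; visits, element, day, and visitors are hypothetical names standing in for the poster's schema):
SELECT v.element, v.day, v.visitors
FROM visits v
WHERE (SELECT COUNT(*)
       FROM visits v2
       WHERE v2.element = v.element
         AND v2.day > v.day) < 5  -- keep only the 5 most recent days per element
ORDER BY v.element, v.day;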