How to repeat records in access table number of times - vba

I need your assistance to figure out how to achieve the following in MS access database.
I have a table with a lot of columns but one of them has a numeric value that will be used as how many times will the record will be repeated.
I need to make another table with repeated records based on Count column.

Build a numbers (aka tally) table (you can google it). I'll call it tblNumbers. Then all you need to do is create a query SELECT <yourTable>.* FROM <yourTable>, tblNumbers WHERE tblNumbers.Number <= <yourTable>.<numberField>

Related

Selecting a large number of rows by index using SQL

I am trying to select a number of rows by the value of a column called ID. I know you can do this pretty easily by:
SELECT col1, col2, col3 FROM mytable WHERE id IN (1,2,3,4,5...)
However, what if there are a few million IDs I want to select and the IDs don't always have pattern (which means I can't use something like BETWEEN x AND y)? Does this select statement still work or is there better ways of doing so?
The actual application is this. Filters are specified by users, which is compared to some attributes of the records. From those filters, we create a subset of the data which is of interest to a particular user. There are about 30 million records each with roughly ~3000 attributes (which is stored in roughly 30 tables, but every table has ID as a primary key), so every time someone makes a query about their desired subset of records, we'd have to join many tables, apply those filters, and figure out what his subset looks like. In order to avoid joining many tables all the time, I thought maybe it's a better idea to join the tables once, figure out the id of the selected subset, and this way each time a new query is made, all we have to do is select the relevant columns of the rows that match the filtered ids.
This depends on the database and the interface you are using. For a few hundred or thousand values, no problem. But your question specifies millions. And that could start to get into limits on the length of the query -- either specified by the database, the tool you are using, or intermediate libraries.
If you have so many ids, I would strongly recommend that you load them into a table in the database with the id as the primary key. Then use join or exists to identify the rows in your table that match.
Often, such a list would be generated in the database anyway. In that case, you can use a subquery or CTE and just include that code in your final query.

SQL to identify duplicate columns from table having hundreds of column

I've 250+ columns in customer table. As per my process, there should be only one row per customer however I've found few customers who are having more than one entry in the table
After running distinct on entire table for that customer it still returns two rows for me. I suspect one of column may be suffixed with space / junk from source tables resulting two rows of same information.
select distinct * from ( select * from customer_table where custoemr = '123' ) a;
Above query returns two rows. If you see with naked eye to results there is not difference in any of column.
I can identify which column is causing duplicates if I run query every time for each column with distinct but thinking that would be very manual task for 250+ columns.
This sounds like very dumb question but kind of stuck here. Please suggest if you have any better way to identify this, thank you.
Solving this one-time issue with sql is too much effort. Simply copy-paste to excel, transpose data into columns and use some simple function like "if a==b then 1 else 0".

select only specific number of columns from Table - Hive

How to select only specific number of columns from a table in hive. For Example, If I have Table with 50 Columns, then how Can I just select first 25 columns ? Is there any easy way to do it rather than hard coading the column names.
I guess that you're asking about using the order in which you defined your columns in your CREATE TABLE statement. No, that's not possible in Hive for the moment.
You could do the trick by adding a new column COLUMN_NUMBER and use that in your WHERE statements, but in that case I would really think twice of the trade off between spending some more time typing your queries and messing your whole table design by adding unnecessary columns. Apart from the fact that if you need to change your table schema in the future (for instance, by adding a new column), adapting your previous code with different column numbers would be painful.

Adding a column to existing table in access without using any relationship

I am working on a project in ACCESS 2010 that requires me to give a rank to 30000 products based on their sales. I have tried using a query to do the ranking and it takes a long time. (Please find codes at Making the ranking query efficient)
I figured out that instead, I can just sort the table based on the Sales column and add a field with numbers 1 to 30000.
So is there a way to add such a column, i.e. a column without any relationship to the existing table.
Add a field to the actual table? If that's the case, make a table and run this query:
ALTER TABLE yourTableName
ADD COLUMN yourColumnName AUTOINCREMENT(1, 1)

Is there any reason this simple SQL query should be so slow?

This query takes about a minute to give results:
SELECT MAX(d.docket_id), MAX(cus.docket_id) FROM docket d, Cashup_Sessions cus
Yet this one:
SELECT MAX(d.docket_id) FROM docket d UNION MAX(cus.docket_id) FROM Cashup_Sessions cus
gives its results instantly. I can't see what the first one is doing that would take so much longer - I mean they both simply check the same two lists of numbers for the greatest one and return them. What else could it be doing that I can't see?
I'm using jet SQL on an MS Access database via Java.
the first one is doing a cross join between 2 tables while the second one is not.
that's all there is to it.
The first one uses Cartesian product to form a source data, which means that every row from the first table is paired with each row from the second one. After that, it searches the source to find out the max values from the columns.
The second doesn't join tables. It just find max from the fist table and the max one from the second table and than returns two rows.
The first query makes a cross join between the tables before getting the maximums, that means that each record in one table is joined with every record in the other table.
If you have two tables with 1000 items each, you get a result with 1000000 items to go through to find the maximums.