Data space after join function is "huge" - sql

My table after a join has grown from two tables of 27 MB and 37 MB to 2930 MB. This table is too large to use further in my project.
I am running Microsoft SQL Server, and after joining the two tables the data space for the resulting table has grown considerably. There are about 40000 rows in the first table and 428 in the second. The number of rows after joining is 37000, which should be correct.
I suspect it might be because of how the data types for the columns were defined. I imported one of the tables (the one with 40000 rows) from Excel, so the import only sampled the first rows to infer each data type. Some rows further down exceeded the inferred type (mostly nvarchar(255), when some values had about 400 characters). I therefore changed the first row so that almost every column has a value of about 300 characters in it; SQL then automatically chose nvarchar(MAX) for those columns, and the import worked.
The join query I used is this one:
SELECT * INTO NewTable
FROM Table1
JOIN Table2
    ON Table2.Row1 = Table1.Row2;
I would like to get the same table as the result, but not larger than 50 MB.
The screenshot is from after the join operation
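One possible first step (not from the original post) is to measure the longest value actually stored in each suspect column and right-size it, since nvarchar(MAX) columns can cause considerable space overhead. This is a sketch; the column name LongTextColumn and the target size of 450 are assumptions to be replaced with real values:

```sql
-- Find the longest value actually stored (column name is an assumption)
SELECT MAX(LEN(LongTextColumn)) AS MaxChars
FROM NewTable;

-- Right-size the column to what the data needs (here: 450 characters),
-- then rebuild the table to reclaim the space the MAX values used
ALTER TABLE NewTable ALTER COLUMN LongTextColumn nvarchar(450);
ALTER TABLE NewTable REBUILD;
```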

Related

Difficulties fetching data from a table

We have a table with 627 columns and approximately 850,000 records.
We are trying to retrieve only two columns and dump that data into a new table, but the query is taking endless time and we are unable to get the result into the new table.
create table test_sample
as
select roll_no, date_of_birth from sample_1;
We have a unique index on the roll_no column (varchar), and the data type of date_of_birth is date.
Your query has no WHERE clause, so it scans the full table. It reads all the columns of every row into memory to extract the columns it needs to satisfy your query. This will take a long time because your table has 627 columns, and I'll bet some of them are pretty wide.
Additionally, a table with that many columns may give you problems with migrated rows or chaining. The impact of that will depend on the relative position of roll_no and date_of_birth in the table's projection.
In short, a table with 627 columns shows poor (non-existent) data modelling, which doesn't help you now; it's just a lesson to be learned.
If this is a one-off exercise you'll just need to let the query run. (Although you should check whether it is running at all: can you see active progress in V$SESSION_LONGOPS?)
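To check whether the query is making progress, you can poll that view. A minimal sketch, assuming the scan is long enough to be reported there (short operations don't always appear):

```sql
-- Long-running operations still in flight; SOFAR/TOTALWORK give a
-- rough progress percentage for each one
SELECT sid,
       opname,
       sofar,
       totalwork,
       ROUND(sofar / totalwork * 100, 1) AS pct_done
FROM   v$session_longops
WHERE  totalwork > 0
AND    sofar < totalwork;
```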

Show data difference in columns of two tables in same database

I am working with SQL Server 2008 and doing data analysis using different queries. My database has two tables in the same schema, each with 70 columns; the data in those tables was entered twice. Now I am comparing the data of each column and showing the records that have differences. Below is my query.
SELECT
    t.Student_Class4_15,
    o.Student_Class4_15
FROM
    [NEEF_Entry].[dbo].[tbl_TOF] AS t
INNER JOIN
    [NEEF_Entry].[dbo].[tbl_TOF_old] AS o ON t.FormID = o.FormID
WHERE
    t.Student_Class4_15 <> o.Student_Class4_15
The join is based on the FormID, which is the same in both tables. The column compared here is Student_Class4_15, in tbl_TOF and in tbl_TOF_old, and the output shows the difference between the data as entered before and after. The problem is that I have to manually replace the column names for all 70 columns each time, which is time-consuming.
What I want is a SQL query that picks all the columns, compares them, and returns the results.
I would use EXCEPT to compare the two tables. If the query returns no rows, the data is the same.
SELECT *
FROM table1
EXCEPT
SELECT *
FROM table2;
In case table2 has extra rows:
SELECT *
FROM table2
EXCEPT
SELECT *
FROM table1;
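If you need the per-column output from the question rather than whole-row differences, one approach (a sketch, not tested against the real schema; it assumes both tables share column names and the FormID key) is to generate the 70 comparison queries from the metadata instead of writing them by hand:

```sql
-- Build one comparison statement per column from the catalog;
-- run the generated statements (or concatenate them with UNION ALL)
SELECT
    'SELECT t.FormID, t.' + c.COLUMN_NAME + ', o.' + c.COLUMN_NAME +
    ' FROM dbo.tbl_TOF t' +
    ' INNER JOIN dbo.tbl_TOF_old o ON t.FormID = o.FormID' +
    ' WHERE t.' + c.COLUMN_NAME + ' <> o.' + c.COLUMN_NAME
    AS comparison_sql
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.TABLE_NAME = 'tbl_TOF'
  AND c.COLUMN_NAME <> 'FormID';
```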

How can I store records with 500 CLOB fields?

Oracle has a maximum column limit of 1000, and even with all columns defined as VARCHAR(4000) I was able to create the table and load huge amounts of data into all fields.
I was able to create a table in SQL Server with 500 varchar(max) columns; however, when I attempted to insert data, I got the following error:
Cannot create a row of size 13075 which is greater than the allowable
maximum row size of 8060.
When I reduced the table to 200 columns I was able to insert huge amounts of data.
Is there a way to do this in SQL Server?
I ran some tests, and it seems there is an overhead of 26 bytes for each populated varchar(max) column.
I was able to populate 308 columns (308 × 26 = 8008 bytes, just under the 8060-byte row limit).
If you divide your columns between two tables you'll be fine (until you hit the next limitation, which will come).
P.S.
I seriously doubt the justification for this table structure.
Is there any reason not to save the data as rows instead of columns?
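A sketch of what that row-per-field layout could look like; the table and column names are illustrative, not from the original schema:

```sql
-- One row per (record, field) pair instead of 500 columns;
-- avoids the 8060-byte row limit since each row holds one value
CREATE TABLE record_fields (
    record_id   int          NOT NULL,
    field_name  varchar(128) NOT NULL,
    field_value varchar(max) NULL,
    PRIMARY KEY (record_id, field_name)
);
```

Reassembling a record then becomes a filter on record_id, and adding a 501st field requires no schema change.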

SSRS 2008 R2 Data Region Embedded in Another Data Region

I have two unrelated tables (Table A and Table B) that I would like to join to create a unique list of pairings: each row in Table A paired with each row in Table B.
My ideas of what can be done:
I can either do this in the query (SQL) by creating one dataset that outputs two fields (each row being a unique pairing).
Or by creating two different datasets (one for each table) and have a data region embedded within a different data region; each data region pulling from a different dataset (of the two created for each table).
I have tried implementing the second method but it would not allow me to select a different dataset for the embedded data region from the parent data region.
The first method I have not tried, but I do not understand how, or even if, it is possible in SQL.
Any help or guidance in this matter would be greatly appreciated!
The first is called a cross join:
select t1.*, t2.*
from t1
cross join t2;
Whether you should do this in the application or in the database is open to question. It depends on the size of the tables and the bandwidth to the database -- there is an overhead to pulling rows from a database.
If each table has 2 rows, this is a non-issue. If each table has 100 rows, then you would be pulling 10,000 rows from the database and it might be faster to pull 2*100 rows and do the looping in the application.

Joining two tables that have no columns in common

I am working with two tables that have no columns I can easily join on to get the data I want.
Information about the tables:
I do see something in common in both tables that I might be able to use to join, but I am not sure how it can be done.
Table1: has a column called File_Name. This column captures the imported file's location.
Example: C:\123\3455\344534\3fjkfj.txt. Max length = 200.
Table2: has a column called batch_ID and contains all the records imported by the file listed in Table1.
The batch_ID column holds the exact same thing as the File_Name column in Table1.
However, the difference is that it only allows a length of 50.
It only keeps the last 50 characters of the file name and directory (50 characters from right to left).
Max length = 50.
Example: ..\344534\3fjkfj.txt (characters are cut off if the value is more than 50 in length).
How would I join these tables on those two columns? I know I could create a function and temp tables, but how can I do it without those?
Thanks!
SELECT Columns
FROM Table1
INNER JOIN Table2
    ON RIGHT(Table1.ColumnA, 50) = RIGHT(Table2.ColumnB, 50)
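Note that wrapping the join columns in RIGHT() prevents the optimizer from using an index on them. If the join is slow on large tables, one option (a sketch; the computed-column and index names are made up for illustration) is a persisted computed column that can be indexed:

```sql
-- Precompute the 50-character suffix once per row and index it,
-- so the join becomes an ordinary indexed equality comparison
ALTER TABLE Table1
    ADD File_Name_Suffix AS RIGHT(File_Name, 50) PERSISTED;

CREATE INDEX IX_Table1_FileNameSuffix
    ON Table1 (File_Name_Suffix);
```

RIGHT() returns the whole value when it is shorter than 50 characters, so short paths still match their batch_ID counterparts.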