I have a table in my database that I need to loop through, making an API call for each row. It's only three columns but about 300,000 rows. I need to grab all three columns and pass them to the requests.get() call.
Is there a way of doing that without putting it into an RDD? Or if putting it into an RDD is the only way, how can it be done? Sorry, I'm a bit new to Spark and working on my own.
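Whatever ends up executing it, the shape of that loop is: stream the rows, call the API once per row, collect the results. A minimal sketch; fetch_rows() and call_api() are hypothetical stand-ins for the real database read and requests.get(), neither of which appears in the question:

```python
def fetch_rows():
    # Stand-in for streaming the three-column table from the database.
    yield from [("a1", "b1", "c1"), ("a2", "b2", "c2")]

def call_api(col_a, col_b, col_c):
    # Stand-in for: requests.get(URL, params={...}).json()
    return {"a": col_a, "b": col_b, "c": col_c}

def process_all():
    # One result per row; a real version would reuse a requests.Session
    # and handle retries/rate limits for 300,000 calls.
    return [call_api(*row) for row in fetch_rows()]
```

At 300k rows the per-call HTTP overhead dominates, so reusing one session (or partitioning the work) matters more than whether the rows sit in an RDD.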
Using NHibernate and getNamedQuery, how can I handle a huge result set without loading all rows into a list?
Let's suppose I need to do some logic with each row, but I don't need it after using it.
I don't want to use pagination, because I can't change the source (a stored procedure).
When you say you don't want to use pagination, do you consider setMaxResults() and setFirstResult() to be pagination?
Query query = session.getNamedQuery("storedProc");
query.setFirstResult(fromIndex);
query.setMaxResults(pageSize);
I'm not really sure you have any other option with Hibernate. There are all sorts of ways you can incorporate partitioning to split the work, but that's only after you've loaded the data that you want to partition.
Assuming you're doing some sort of logic against a row and storing in the DB, and seeing as you're already using stored procs, your best bet may be another stored proc. Alternatively, if you want to keep your logic outside of your DB, then you should be working off of tables rather than data from a stored proc.
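The setFirstResult()/setMaxResults() loop amounts to this generic paging pattern, sketched here in Python with a hypothetical fetch_page(first, max_rows) standing in for the Hibernate named query:

```python
def paginate(fetch_page, page_size=1000):
    # fetch_page(first, max_rows) must return fewer than max_rows rows
    # (possibly zero) once the result set is exhausted.
    first = 0
    while True:
        page = fetch_page(first, page_size)
        if not page:
            break
        for row in page:
            yield row          # do per-row logic here, then drop the row
        if len(page) < page_size:
            break              # short page: no more rows
        first += page_size

# Example against an in-memory "result set":
data = list(range(10))
fetch = lambda first, n: data[first:first + n]
result = list(paginate(fetch, page_size=4))
```

Only one page is held in memory at a time, which is the point of paging a huge result set.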
I need to return the rows of some tables via RFC, and the names of these tables are not known before the execution.
I have this statement which gets executed in a loop:
SELECT *
  UP TO iv_max_count ROWS
  INTO TABLE <lt_result>
  FROM (iv_table_name) AS ltab
  WHERE (SQL_WHERE).
How can I concatenate the <lt_result> results to one list/table and return this via RFC?
Of course all tables can have different structures. Creating one big table which holds all rows does not help.
You can't return an arbitrary structure or structures in an RFC, they have to be predefined.
The best way I can think of is to mimic the way SAP stores IDocs in the database. Your table would need a minimum of two fields: the first is a descriptor telling the caller what the table structure is, and the second is a very long character field with all the row data concatenated together, either fixed-width or delimited. This way you can pass data from multiple tables in the same return structure.
If your calling program really knows nothing about SAP data structures, you would probably also need to grab metadata from table DD02L.
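The descriptor-plus-character-field idea can be sketched generically (Python here purely to show the encoding; in ABAP the pairs would fill the RFC return table and the caller would resolve each structure via DD02L):

```python
# Encode rows from tables with different structures into one uniform
# (descriptor, delimited_data) shape, as the IDoc-style answer suggests.
DELIM = "|"

def encode_rows(table_name, rows):
    # rows: tuples laid out in this table's own column order
    return [(table_name, DELIM.join(str(v) for v in row)) for row in rows]

def decode_row(record):
    table_name, data = record
    return table_name, data.split(DELIM)

# Rows from two structurally different tables travel in one list:
transfer = encode_rows("MARA", [("MAT1", 10)]) \
         + encode_rows("KNA1", [("C01", "ACME", "DE")])
```

Each record carries its table name, so the receiver can map the delimited fields back onto the right columns.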
In short, that's not how ABAP and function modules work.
You have to define exactly what your input is and what your output structure/table looks like. You can return one structure that holds multiple deep nested tables, to have only one return structure, but not dynamically!
Making this all dynamic makes things a lot more complex. Mostly unnecessarily.
One possible way:
you have to analyze the input and build dynamic structures and tables for each input table result
build a wrapping structure that consists of all the nested tables
return a DATA reference object, because you cannot return generic datatypes
your receiving program needs to have the same data structures defined; it must know exactly what it is getting back in order to dereference the data.
Another way:
Use Function Module RFC_READ_TABLE in a loop in the caller program
Reading multiple single tables dynamically in a loop without a join does not sound like ABAP programming, more like "I need data from SAP in a third party tool".
A little background first: I need to write a stored procedure that will grab everything in a table where the id is NOT in a list of other ids. The Azure SQL docs indicate that if you're going to have a large list of items in an IN or NOT IN clause, you should consider storing those items in a temporary table. So I figured I'd do that, since I don't know how many items may end up in this list. In my Azure Function (C# code) I will have the list of IDs that I want placed into that temporary table. I'm not sure of the best way to do this.
I could skip the stored procedure and instead write the query in my Azure Function, using a transaction and a for loop to insert each item into the temporary table (I think, at least; I'm not very well-versed in this topic). I read that when inserting a lot of items it's best to batch them in one transaction.
If I do use a stored procedure, how do I do this? I have the list of IDs in my C# code but I don't know how I'd pass that in to the stored procedure (just one giant array?) or if there's any limits on how many items I could even pass in. And then in the stored procedure, how would it go about inserting this list of IDs that was passed in? Is there some sort of for loop syntax?
Hopefully there's a way to do this that is somewhat efficient.
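The temp-table pattern the docs describe looks like this, using Python's built-in SQLite driver purely to show the shape (real code would target SQL Server with a #temp table or a table-valued parameter, and the table/column names here are made up for the sketch):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c"), (4, "d")])

# 1) Load the excluded IDs into a temp table in one batched call
#    (executemany runs inside a single transaction), instead of
#    splicing a giant NOT IN (...) literal into the query.
excluded = [2, 4]
conn.execute("CREATE TEMP TABLE excluded_ids (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO excluded_ids VALUES (?)",
                 [(i,) for i in excluded])

# 2) Anti-join against the temp table.
rows = conn.execute("""
    SELECT i.id, i.name FROM items i
    WHERE i.id NOT IN (SELECT id FROM excluded_ids)
    ORDER BY i.id
""").fetchall()
```

The per-item loop lives in the driver's batched insert, not in hand-written T-SQL; on SQL Server a table-valued parameter removes even that, sending the whole list in one round trip.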
I am not sure how to index a List/array in Apache Ignite. I want to use the list/array in a WHERE clause. I could write a custom function, but it would scan the whole data set; I am looking for a way to index the list/array.
Please help me.
A common way to store lists in a SQL database is to create a table of pairs, representing a one-to-many relation.
The columns of this pairs table can be indexed and used in WHERE clauses after joining with the initial table.
To make joins work fast, you will probably need to make records of these two tables collocated by affinity.
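The pair-table layout can be sketched like this (SQLite via Python's stdlib, purely to show the one-to-many shape, the index, and the join; in Ignite the queried columns would be annotated with @QuerySqlField and the two caches collocated by affinity key, and all names here are invented for the sketch):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Main table: one row per entity.
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
# Pair table: one row per (person, tag) pair, replacing an in-row list.
conn.execute("CREATE TABLE person_tag (person_id INTEGER, tag TEXT)")
conn.execute("CREATE INDEX idx_tag ON person_tag (tag)")  # a list column can't be indexed; a pair column can

conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "ann"), (2, "bob")])
conn.executemany("INSERT INTO person_tag VALUES (?, ?)",
                 [(1, "admin"), (1, "dev"), (2, "dev")])

# "WHERE the list contains 'admin'" becomes an indexed join:
names = [r[0] for r in conn.execute("""
    SELECT p.name FROM person p
    JOIN person_tag t ON t.person_id = p.id
    WHERE t.tag = ?
    ORDER BY p.name
""", ("admin",))]
```

The WHERE clause now hits the index on the pair table instead of scanning every list in the data set.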
This question already has answers here:
Pass table valued parameter using ADO.NET
(5 answers)
Closed 9 years ago.
I need help deciding on the recommended approach for inserting an array/list into a database (SQL Server 2008) using a stored procedure.
The obvious approach would be to invoke the SP once for each row of the array.
But there has to be a better approach, one that sends the list/array over the network only once.
Help is appreciated.
The best approach here would depend on how you would want to query the data:
If you might want to be able to query aspects of each row in the List/Array, then inserting one Db row per one List/Array item is the best way to go.
If you are not going to query items within the List/Array, and just want to save it for auditing purposes or some other referential need where you would never retrieve anything but the full set, then you could get away with serializing the whole thing to JSON and storing it in a key/value table where the value is of type nvarchar(max).
But there has to be a better approach to send the list/array over the network only once.
Assuming you have the data in a parameter, use a structured parameter: stored procedures can accept table-valued parameters.
As such, this is a duplicate of....
Pass table valued parameter using ADO.NET
which also has the solution.