Why such a simple BufWriter operation didn't work

The following code is very simple: open a file for writing, wrap it in a BufWriter, and write a single string.
The program reports no errors and the write returns Ok(10), but the file ends up empty.
use tokio::io::AsyncWriteExt; // brings `write` (and `flush`) into scope

#[tokio::test]
async fn save_file_async() {
    let path = "./hello.txt";
    let inner = tokio::fs::OpenOptions::new()
        .create(true)
        .write(true)
        //.truncate(true)
        .open(path)
        .await
        .unwrap();
    let mut writer = tokio::io::BufWriter::new(inner);
    println!(
        "{} bytes wrote",
        writer.write("1234567890".as_bytes()).await.unwrap()
    );
}

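The bytes are still in the BufWriter's internal buffer when the test returns, and tokio's BufWriter discards its buffer on drop instead of flushing it. You need an explicit flush after the write:
    writer.flush().await.unwrap();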
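The buffering behind this answer can be observed without tokio. The following is a minimal sketch that uses std::io::BufWriter over an in-memory Vec<u8> as a stand-in for the file; the function name is made up for illustration. It only shows that a small write stays in the buffer until flush is called. It does not reproduce tokio's drop behavior, since the std BufWriter attempts a flush on drop.

```rust
use std::io::{BufWriter, Write};

/// Hypothetical helper: writes `data` through a `BufWriter` wrapped around an
/// in-memory `Vec<u8>` and returns how many bytes had reached the inner
/// writer before and after the explicit flush.
fn bytes_visible_before_and_after_flush(data: &[u8]) -> std::io::Result<(usize, usize)> {
    // The Vec plays the role of the file in the question.
    let mut writer = BufWriter::new(Vec::new());

    // A write smaller than the buffer capacity is only copied into the buffer.
    writer.write_all(data)?;
    let before = writer.get_ref().len();

    // flush pushes the buffered bytes down to the inner writer.
    writer.flush()?;
    let after = writer.get_ref().len();

    Ok((before, after))
}
```

Calling bytes_visible_before_and_after_flush(b"1234567890") should show nothing in the inner writer before the flush and all ten bytes after it, which matches the empty file seen in the question.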

Related

AWS s3 in Rust: Get and store a file - Invalid file header when opening

What I want to do: Download an S3 file (pdf) in a lambda and extract its text, using Rust.
The Error:
ERROR PDF error: Invalid file header
I checked the pdf file in the bucket, downloaded it from the console and everything looks correct, so something is breaking in the way I store the file.
How I am doing it:
let config = aws_config::load_from_env().await;
let client = s3::Client::new(&config);
// Get uploaded object in raw bucket (serde derived the json)
let key = event.records.get(0).unwrap().s3.object.key.clone();
let key = key.replace('+', " ");
let key = percent_encoding::percent_decode_str(&key).decode_utf8().unwrap().to_string();
let content = client
    .get_object()
    .bucket(raw_bucket_name)
    .key(&key)
    // .response_content_type("application/pdf") // this did not make any difference
    .send()
    .await?;
let mut bytes = content.body.into_async_read();
let file = tempfile::NamedTempFile::new()?;
let path = file.into_temp_path();
let mut file = tokio::fs::File::create(&path).await?;
tokio::io::copy(&mut bytes, &mut file).await?;
let content = pdf_extract::extract_text(path)?; // this line breaks
Versions:
tokio = { version = "1", features = ["macros"] }
aws-sdk-s3 = "0.21.0"
aws-config = "0.51.0"
pdf-extract = "0.6.4"
I feel like I have misunderstood something about how to store the byte stream, but e.g. https://stackoverflow.com/a/62003659/4986655 does it the same way as far as I can see.
Any help or pointers on what the issue might be or how to debug this are very welcome.
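One way to narrow down an "Invalid file header" error is to look at the first bytes of the stored file before handing it to the PDF parser. The sketch below uses only the standard library; both function names are hypothetical, and the check assumes the file should begin with the usual %PDF- marker. If the prefix turns out to be empty or truncated, the data may not have reached disk yet when the file is read, which would be the same symptom as the missing flush in the main question. That is a guess to test, not a confirmed diagnosis.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Returns true if `bytes` begins with the `%PDF-` marker that PDF files
/// normally start with.
fn has_pdf_header(bytes: &[u8]) -> bool {
    bytes.starts_with(b"%PDF-")
}

/// Hypothetical debugging helper: reads at most the first 16 bytes of the
/// file at `path` so they can be logged or passed to `has_pdf_header`.
fn read_file_prefix(path: &Path) -> io::Result<Vec<u8>> {
    let mut prefix = Vec::new();
    // `take(16)` limits the read to the first 16 bytes.
    File::open(path)?.take(16).read_to_end(&mut prefix)?;
    Ok(prefix)
}
```

In the question's code, the check would go between the copy and the extract_text call, for example by logging the result of read_file_prefix for the temporary path.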

Queries in delta tables

I'm trying to read some tables from a Delta Lake stored in an S3 bucket, using delta-rs in Rust.
When I run the code, it seems to open the table, because printing the table returns the following:
version: 0
metadata: GUID=36348853-e380-4d3d-986f-034b1cd7bcd2, name=None, description=None, partitionColumns=[], createdTime=1632167494225, configuration={}
min_version: read=1, write=2
files count: 1
But when I try to query on it, it returns the following message:
Parquet reader thread terminated due to error:
IoError(Os { code: 2, kind: NotFound, message: "No such file or directory" }
Here is my code, and I don't have any idea what I can do to solve this issue:
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    env::set_var("AWS_ACCESS_KEY_ID", AWS_ACCESS_KEY_ID);
    env::set_var("AWS_SECRET_ACCESS_KEY", AWS_SECRET_ACCESS_KEY);
    let web_site_request = GetBucketWebsiteRequest{bucket: S3_TEST_BUCKET.to_string(), expected_bucket_owner: None};
    let table_uri = "s3://dev-evandro/common/lakehouse-sync/parquet/payments/chargebee/customer/";
    let be = storage::get_backend_for_uri(table_uri).unwrap();
    let table = deltalake::open_table(table_uri).await.unwrap();
    println!("{}", table);
    let mut ctx = ExecutionContext::new();
    ctx.register_table("test_table", Arc::new(table))?;
    let batches = ctx
        .sql("SELECT * FROM test_table LIMIT 1")?
        .collect()
        .await?;
    let batch = pretty_format_batches(&[batches][0]).unwrap();
    println!("{}", batch);
    Ok(())
}

Rust macro to generate multiple individual tests

Is it possible to have a macro that generates standalone tests? I have two text files, one with an input and another with an output. Each new line in the text file represents a new test.
Currently, this is how I run my tests:
#[test]
fn it_works() {
    let input = read_file("input.txt").expect("failed to read input");
    let input = input.split("\n").collect::<Vec<_>>();
    let output = read_file("output.txt").expect("failed to read output");
    let output = output.split("\n").collect::<Vec<_>>();
    input.iter().zip(output).for_each(|(a, b)| {
        println!("a: {}, b: {}", a, b);
        assert_eq!(b, get_result(a));
    });
}
But, as you can see, if one case fails, the whole test fails, since there's a loop inside a single test. I need each iteration to be a single, isolated test, without having to repeat myself.
So I was wondering if it's possible to achieve that by using macros?
The macro ideally would output something like:
#[test]
fn it_works_1() {
    let input = read_file("input.txt").expect("failed to read input");
    let input = input.split("\n").collect::<Vec<_>>();
    let output = read_file("output.txt").expect("failed to read output");
    let output = output.split("\n").collect::<Vec<_>>();
    assert_eq!(output[0], get_result(input[0])); // first test
}
#[test]
fn it_works_2() {
    let input = read_file("input.txt").expect("failed to read input");
    let input = input.split("\n").collect::<Vec<_>>();
    let output = read_file("output.txt").expect("failed to read output");
    let output = output.split("\n").collect::<Vec<_>>();
    assert_eq!(output[1], get_result(input[1])); // second test
}
// ... the N remaining tests: it_works_n()

You can't do this with a declarative macro because a declarative macro cannot generate an identifier to name the test functions. However, you can use a crate such as test-case, which can run the same test with different inputs:
use test_case::test_case;
#[test_case(0)]
#[test_case(1)]
#[test_case(2)]
#[test]
fn it_works(index: usize) {
    let input = read_file("input.txt").expect("failed to read input");
    let input = input.split("\n").collect::<Vec<_>>();
    let output = read_file("output.txt").expect("failed to read output");
    let output = output.split("\n").collect::<Vec<_>>();
    assert_eq!(output[index], get_result(input[index])); // the case selected by `index`
}
If you have a lot of different inputs to test, you could use a declarative macro to generate the code above, which would add all of the #[test_case] annotations.
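For comparison, here is a minimal sketch of what a declarative macro can do on its own: it cannot invent names like it_works_1, but it can emit one #[test] function per name that the caller spells out. Everything here is illustrative. The macro name line_tests, the run_case helper, the in-memory INPUT and OUTPUT constants, and the uppercase get_result are stand-ins for the question's files and function, not part of the original code.

```rust
/// Hypothetical stand-in for the question's `get_result`.
fn get_result(input: &str) -> String {
    input.to_uppercase()
}

// In-memory stand-ins for input.txt and output.txt (one case per line).
const INPUT: &str = "a\nb\nc";
const OUTPUT: &str = "A\nB\nC";

/// Returns (expected, actual) for the case on line `index`.
fn run_case(index: usize) -> (String, String) {
    let input: Vec<&str> = INPUT.lines().collect();
    let output: Vec<&str> = OUTPUT.lines().collect();
    (output[index].to_string(), get_result(input[index]))
}

/// Expands to one standalone #[test] function per `name: index` pair.
/// The caller has to supply each function name explicitly.
macro_rules! line_tests {
    ($($name:ident: $index:expr),+ $(,)?) => {
        $(
            #[test]
            fn $name() {
                let (want, got) = run_case($index);
                assert_eq!(want, got);
            }
        )+
    };
}

line_tests! {
    it_works_0: 0,
    it_works_1: 1,
    it_works_2: 2,
}
```

Under cargo test, each listed name shows up as its own test, so one failing line does not hide the others. The cost is that every name and index still has to be written out by hand, which is the part crates such as test-case take care of.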

After Peter Hall's answer, I was able to achieve what I wanted. I added the seq_macro crate to generate the repeated #[test_case] attributes. Maybe there's a way to loop through all test cases instead of manually defining the number of tests (like I did), but this is good for now:
macro_rules! test {
    ( $from:expr, $to:expr ) => {
        #[cfg(test)]
        mod tests {
            use crate::{get_result, read_file};
            use seq_macro::seq;
            use test_case::test_case;
            seq!(N in $from..$to {
                #(#[test_case(N)])*
                fn it_works(index: usize) {
                    let input = read_file("input.txt").expect("failed to read input");
                    let input = input.split("\n").collect::<Vec<_>>();
                    let output = read_file("output.txt").expect("failed to read output");
                    let output = output.split("\n").collect::<Vec<_>>();
                    let res = get_result(input[index]);
                    assert_eq!(
                        output[index], res,
                        "Test '{}': Want '{}' got '{}'",
                        input[index], output[index], res
                    );
                }
            });
        }
    };
}
test!(0, 82);

Rebuild SQL database through command line in swift

Having a very difficult time trying to run command line arguments through Swift. I need to run commands on SQL files that a user manually drags onto the app (so the file path is different every time).
The piping between my app and the command line is working (sending 'pwd' will return the correct response), but when I try sending the arguments I want I cannot get them to work. I have tried using both "bin/bash" and "usr/bin/env" to no avail.
Essentially I am trying to rebuild a database that has been corrupted, without having to go in through terminal and do it myself. Common errors I see across attempts include 'Launch path not accessible' or 'File or directory not found'. I have tried using 'chmod 6' through terminal to set the permissions on the file, but this still does not work. Any help on what I am doing wrong to access the file, or another way to try and rebuild a database, would be greatly appreciated.
func checkForCorruption(filePath: URL) -> (String?, Bool){
    let folder = filePath.deletingLastPathComponent()
    let arguments = ["cd \(folder.relativePath)", "sqlite3 Restaurant.sql", ".mode insert",".output dump.sql",".dump", ".exit"]
    let task = Process()
    task.launchPath = "bin/bash/"
    task.arguments = arguments
    let inPipe = Pipe()
    task.standardInput = inPipe
    let pipe = Pipe()
    task.standardOutput = pipe
    let errPipe = Pipe()
    task.standardError = errPipe
    var output : [String] = []
    task.launch()
    task.waitUntilExit()
    let data = pipe.fileHandleForReading.readDataToEndOfFile()
    let errData = errPipe.fileHandleForReading.readDataToEndOfFile()
    if let out = NSString(data: data, encoding: String.Encoding.utf8.rawValue){
        print(out)
    }
    if let errOut = NSString(data: errData, encoding: String.Encoding.utf8.rawValue){
        print("error: \(errOut)")
    }
    let outHandle = pipe.fileHandleForReading
    if var string = String(data: data, encoding: .utf8) {
        string = string.trimmingCharacters(in: .newlines)
        output = string.components(separatedBy: "\n")
        do {
            try string.write(toFile: "\(folder.relativePath)/dump.sql", atomically: true, encoding: String.Encoding.utf8)
        }
        catch _ {
            print("something went wrong")
        }
    }
    outHandle.readabilityHandler = { pipe in
        print("reading")
        if let line = String(data: pipe.availableData, encoding: String.Encoding.utf8) {
            print("New ouput: \(line)")
        } else {
            print("Error decoding data: \(pipe.availableData)")
        }
    }
    return ("", false)
}

I got some help at work. For anyone struggling with this, here is the answer (the print statement just shows where the dump file ends up):
let arguments = ["\(filePath.relativePath)", ".mode insert",".output dump.sql",".dump", ".exit"]
let task = Process()
task.launchPath = "/usr/bin/sqlite3"
print(FileManager.default.currentDirectoryPath)

How to execute external program from Swift?

I am new to Swift and I did not find anything about executing external programs or accessing external processes using the Swift language.
Is it possible at the current stage of the language's development, or should I use Objective-C instead?
Maybe there are some Objective-C libraries that can be used inside my Swift program?
Thanks.

You can run external programs using NSTask. For example, from Circle and Square:
import Foundation
func executeCommand(command: String, args: [String]) -> String {
    let task = NSTask()
    task.launchPath = command
    task.arguments = args
    let pipe = NSPipe()
    task.standardOutput = pipe
    task.launch()
    let data = pipe.fileHandleForReading.readDataToEndOfFile()
    let output: String = NSString(data: data, encoding: NSUTF8StringEncoding)
    return output
}
let commandOutput = executeCommand("/bin/echo", ["Hello, I am here!"])
println("Command output: \(commandOutput)")

Improved version of Rob's answer (in that you don't need to specify the full path of your executable), and also updated for Swift 3:
import Foundation
func execCommand(command: String, args: [String]) -> String {
    if !command.hasPrefix("/") {
        let commandFull = execCommand(command: "/usr/bin/which", args: [command]).trimmingCharacters(in: CharacterSet.whitespacesAndNewlines)
        return execCommand(command: commandFull, args: args)
    } else {
        let proc = Process()
        proc.launchPath = command
        proc.arguments = args
        let pipe = Pipe()
        proc.standardOutput = pipe
        proc.launch()
        let data = pipe.fileHandleForReading.readDataToEndOfFile()
        return String(data: data, encoding: String.Encoding.utf8)!
    }
}
