Is converting a file to a byte array the best way to save ANY file format to disk or database var binary column?
So if someone wants to save a .gif or .doc/.docx or .pdf file, can I just convert it to a bytearray UFT8 and save it to the db as a stream of bytes?
I'll describe the way I've stored files, in SQL Server and Oracle. It largely depends on how you are getting the file, in the first place, as to how you will get its contents, and it depends on which database you are using for the content in which you will store it for how you will store it. These are 2 separate database examples with 2 separate methods of getting the file that I used.
SQL Server
Short answer: I used a base64 byte string I converted to a byte[]
and store in a varbinary(max)
field.
Long answer:
Say you're uploading via a website, so you're using an <input id="myFileControl" type="file" />
control, or React DropZone. To get the file, you're doing something like var myFile = document.getElementById("myFileControl")[0];
or myFile = this.state.files[0];
.
From there, I'd get the base64 string using code here: Convert input=file to byte array (use function UploadFile2
).
Then I'd get that string, the file name (myFile.name
) and type (myFile.type
) into a JSON object:
var myJSONObj = {
file: base64string,
name: myFile.name,
type: myFile.type,
}
and post the file to an MVC server backend using XMLHttpRequest, specifying a Content-Type of application/json
: xhr.send(JSON.stringify(myJSONObj);
. You have to build a ViewModel to bind it with:
public class MyModel
{
public string file { get; set; }
public string title { get; set; }
public string type { get; set; }
}
and specify [FromBody]MyModel myModelObj
as the passed in parameter:
[System.Web.Http.HttpPost] // required to spell it out like this if using ApiController, or it will default to System.Mvc.Http.HttpPost
public virtual ActionResult Post([FromBody]MyModel myModelObj)
Then you can add this into that function and save it using Entity Framework:
MY_ATTACHMENT_TABLE_MODEL tblAtchm = new MY_ATTACHMENT_TABLE_MODEL();
tblAtchm.Name = myModelObj.name;
tblAtchm.Type = myModelObj.type;
tblAtchm.File = System.Convert.FromBase64String(myModelObj.file);
EntityFrameworkContextName ef = new EntityFrameworkContextName();
ef.MY_ATTACHMENT_TABLE_MODEL.Add(tblAtchm);
ef.SaveChanges();
tblAtchm.File = System.Convert.FromBase64String(myModelObj.file);
being the operative line.
You would need a model to represent the database table:
public class MY_ATTACHMENT_TABLE_MODEL
{
[Key]
public byte[] File { get; set; } // notice this change
public string Name { get; set; }
public string Type { get; set; }
}
This will save the data into a varbinary(max)
field as a byte[]
. Name
and Type
were nvarchar(250)
and nvarchar(10)
, respectively. You could include size by adding it to your table as an int
column & MY_ATTACHMENT_TABLE_MODEL
as public int Size { get; set;}
, and add in the line tblAtchm.Size = System.Convert.FromBase64String(myModelObj.file).Length;
above.
Oracle
Short answer: Convert it to a byte[]
, assign it to an OracleParameter
, add it to your OracleCommand
, and update your table's BLOB
field using a reference to the parameter's ParameterName
value: :BlobParameter
Long answer:
When I did this for Oracle, I was using an OpenFileDialog
and I retrieved and sent the bytes/file information this way:
byte[] array;
OracleParameter param = new OracleParameter();
Microsoft.Win32.OpenFileDialog dlg = new Microsoft.Win32.OpenFileDialog();
dlg.Filter = "Image Files (*.jpg, *.jpeg, *.jpe)|*.jpg;*.jpeg;*.jpe|Document Files (*.doc, *.docx, *.pdf)|*.doc;*.docx;*.pdf"
if (dlg.ShowDialog().Value == true)
{
string fileName = dlg.FileName;
using (FileStream fs = File.OpenRead(fileName)
{
array = new byte[fs.Length];
using (BinaryReader binReader = new BinaryReader(fs))
{
array = binReader.ReadBytes((int)fs.Length);
}
// Create an OracleParameter to transmit the Blob
param.OracleDbType = OracleDbType.Blob;
param.ParameterName = "BlobParameter";
param.Value = array; // <-- file bytes are here
}
fileName = fileName.Split('\\')[fileName.Split('\\').Length-1]; // gets last segment of the whole path to just get the name
string fileType = fileName.Split('.')[1];
if (fileType == "doc" || fileType == "docx" || fileType == "pdf")
fileType = "application\\" + fileType;
else
fileType = "image\\" + fileType;
// SQL string containing reference to BlobParameter named above
string sql = String.Format("INSERT INTO YOUR_TABLE (FILE_NAME, FILE_TYPE, FILE_SIZE, FILE_CONTENTS, LAST_MODIFIED) VALUES ('{0}','{1}',{2},:BlobParamerter, SYSDATE)", fileName, fileType, array.Length);
// Do Oracle Update
RunCommand(sql, param);
}
And inside the Oracle update, done with ADO:
public void RunCommand(string sql, OracleParameter param)
{
OracleConnection oraConn = null;
OracleCommand oraCmd = null;
try
{
string connString = GetConnString();
oraConn = OracleConnection(connString);
using (oraConn)
{
if (OraConnection.State == ConnectionState.Open)
OraConnection.Close();
OraConnection.Open();
oraCmd = new OracleCommand(strSQL, oraConnection);
// Add your OracleParameter
if (param != null)
OraCommand.Parameters.Add(param);
// Execute the command
OraCommand.ExecuteNonQuery();
}
}
catch (OracleException err)
{
// handle exception
}
finally
{
OraConnction.Close();
}
}
private string GetConnString()
{
string host = System.Configuration.ConfigurationManager.AppSettings["host"].ToString();
string port = System.Configuration.ConfigurationManager.AppSettings["port"].ToString();
string serviceName = System.Configuration.ConfigurationManager.AppSettings["svcName"].ToString();
string schemaName = System.Configuration.ConfigurationManager.AppSettings["schemaName"].ToString();
string pword = System.Configuration.ConfigurationManager.AppSettings["pword"].ToString(); // hopefully encrypted
if (String.IsNullOrEmpty(host) || String.IsNullOrEmpty(port) || String.IsNullOrEmpty(serviceName) || String.IsNullOrEmpty(schemaName) || String.IsNullOrEmpty(pword))
{
return "Missing Param";
}
else
{
pword = decodePassword(pword); // decrypt here
return String.Format(
"Data Source=(DESCRIPTION =(ADDRESS = ( PROTOCOL = TCP)(HOST = {2})(PORT = {3}))(CONNECT_DATA =(SID = {4})));User Id={0};Password={1};",
user,
pword,
host,
port,
serviceName
);
}
}
And the datatype for the FILE_CONTENTS
column was BLOB
, the FILE_SIZE
was NUMBER(10,0)
, LAST_MODIFIED
was DATE
, and the rest were NVARCHAR2(250)
.
Yes, generally the best way to store a file in a database is to save the byte array in a BLOB column. You will probably want a couple of columns to additionally store the file's metadata such as name, extension, and so on.
It is not always a good idea to store files in the database - for instance, the database size will grow fast if you store files in it. But that all depends on your usage scenario.
While you can store files in this fashion, it has significant tradeoffs:
These are just some of the drawbacks I can come up with off the top of my head. For tiny projects it may be worth storing files in this fashion, but if you're designing enterprise-grade software I would strongly recommend against it.
It really depends on the database server.
For example, SQL Server 2008 supports a FILESTREAM
datatype for exactly this situation.
Other than that, if you use a MemoryStream
, it has a ToArray()
method that will convert to a byte[]
- this can be used for populating a varbinary
field..
What database are you using? normally you don't save files to a database but i think sql 2008 has support for it...
A file is binary data hence UTF 8 does not matter here..
UTF 8 matters when you try to convert a string to a byte array... not a file to byte array.
Confirming I was able to use the answer posted by MadBoy and edited by Otiel on both MS SQL Server 2012 and 2014 in addition to the versions previously listed using varbinary(MAX) columns.
If you are wondering why you cannot "Filestream" (noted in a separate answer) as a datatype in the SQL Server table designer or why you cannot set a column's datatype to "Filestream" using T-SQL, it is because FILESTREAM is a storage attribute of the varbinary(MAX) datatype. It is not a datatype on its own.
See these articles on setting up and enabling FILESTREAM on a database: https://msdn.microsoft.com/en-us/library/cc645923(v=sql.120).aspx
http://www.kodyaz.com/t-sql/default-filestream-filegroup-is-not-available-in-database.aspx
Once configured, a filestream enabled varbinary(max) column can be added as so:
ALTER TABLE TableName
ADD ColumnName varbinary(max) FILESTREAM NULL
GO
Source: Stackoverflow.com