My fun with exporting data from Notes continues. I've finished constructing my "SQL insert script view" so that it now outputs a SQL script full of inserts which are compatiable with SubText's SQL database schema. The view basically contains a single computed column containing @Functions that build a single output string which looks suspiciously like a SQL insert statement. I went into my view, exported the data, and run the output file as a .SQL script. The script runs, however it needs a bit of manual massaging after the export due to a few extra escape characters that somehow managed to appear. The weird thing is that some rows are fine, and others aren't. I'm not sure why this is, but I've tried all of the export options and none of them presented me with output that was usable without a lot of data massaging.
In case you're interested, here's the code for the computed column. There are a couple of fields there (such as txtTechnoratiTags and FakeFileName) which are very specialised to the software that the source site runs (completely custom written), however you can see the basic idea.
utcDateTime := @Text(@Year(@Created)) + "-" + @Right("0" + @Text(@Month(@Created)); 2) + "-" + @Right("0" + @Text(@Day(@Created)); 2) + " " + @Right("0" + @Text(@Hour(@Created)); 2) + ":" + @Right("0" + @Text(@Minute(@Created)); 2) + ":00.000";techTags := @If(TechnoratiTags != ""; "<p>Technorati tags: " + txtTechnoratiTags + "</p>"; "");"insert into subtext_Content (Title, DateAdded, PostType, Author, Email, BlogId, DateUpdated, [Text], PostConfig, EntryName, DateSyndicated) VALUES(" +
"'" + @ReplaceSubstring( Title; "'"; "''" ) + "', " +
"'" + utcDateTime + "', " +
"1, " +
"'Ross'" + ", " +
"'ross@somewhere.com'" + ", " +
"0, " +
"'" + utcDateTime + "', " +
"'" + @ReplaceSubstring( @Implode(Body; @NewLine); "'"; "''" ) + techTags + "', " +
"93, " +
"'" + @ReplaceSubstring( FakeFileName; "'"; "" ) + "', " +
"'" + utcDateTime + "')"
Given how badly all of the basic exports were mangling my data, I decided to write a quick agent to export that column alone. I could have duplicated my @Function above in LotusScript, but decided that would be fairly pointless. I'm also trying to do this while spending as little time as possible using Notes, so the thought of re-doing my @Functions in Script simply seemed like a waste of time. So I wrote a quick and nasty agent to traverse view entries and simply spit out the contents of that one column alone. In it's quick, nasty, and full of hardcoded content form it looks like this:
Sub Initialize
Dim session As New NotesSession
Dim w As NotesUIWorkspace
Dim thisdb As NotesDatabase
Dim view As NotesView
Set thisdb = session.CurrentDatabase
Set view = thisdb.GetView("sqlexport")
Set vn = view.CreateViewNav
Set ve = vn.GetFirst
Dim fileHandle As Integer
fileHandle = Freefile
Open "c:\test.txt" For Output As #fileHandle
While Not (ve Is Nothing)
Print #fileHandle, ve.ColumnValues(3)
Set ve = vn.GetNext(ve)
Wend
Close #fileHandle
End Sub
The output of these 2 combined is SQL which I can run to populate an empty SubText database with all of the html (even if some of the img tags wont work) in my existing source database. It's a bit nasty, and a bit hacky, but it will do the job - however, how the hell is anyone going to do this if they're not a developer who's semi capable in both platforms?
While playing about with this I had thoughts of writing an application to perform this work en masse for other databases. The idea being you point the tool at a database, select a form, and enter some mapping information against the fields you wish to migrate (target field name, target data type, etc). The tool would then prompt you for a view containing your source documents, and run through generating a SQL script. I thought about this for all of 10 minutes before ruling it out due to the serious limitations it would have:
- The app wouldn't be able to handle Rich Text fields. Sure, I could create a text only abstract version and insert that, but all formatting, objects, and attachments would be lost. I suspect this would be unacceptable.
- For it to be useful against any semi-complex Notes applications, you'd really want to find a way to work on any child documents and restore the heirachy in the target database. This is more work than I was prepared to put in (and all MY data is flat!)
- The thought of doing more than 15 mins worth of coding at a time in Notes is a real turn off.
(Having said that, if there's high demand out there I could still be convinced, so please leave a comment if this is something you might have a use for.)
So where does this leave me? I have a way of getting historical text based data out of my source website database, and porting it into Subtext. The method is adaptable so that I could use it against other Notes databases, but it'd require a bit of developer tweaking here and there. It's not something which would work for Rich text data, and it's not the sort of process that someone who isn't fluent in SQL scripts as well as Notes/Domino development would want to be spending time with. My data will be liberated.
While doing all of this, I couldn't help but think of various conversations I've read in past years about 'vendor lock in'. During all of my years using Notes I'd never stopped to think about how proprietary Rich Text fields were. For a few years, I just used them. Around 1999/2000 I started trying to put rich text fields on the web, and started to see that there was a translation process there which didn't always go as smoothly as planned - that should have been all the warning I needed. It seems that wasn't the case. I can be a bit stubborn at times.
Every vendor wants to keep it's customers using their software. Why spend time developing products to help customers leave? It's simply not good business sense. Banks do it too - ever tried changing banks? All the things you'll need to update, the forms to fill out, the things you'll forget and need to update, it's going to take time and energy - and guess what? It's the same with software vendors. I know there's demand for this sort of thing out there. I've had a few emails from people looking for migration advice, and it's very hard to give at a generic level because it will vary depending on each persons individual situation as well as their level of knowledge in their source and destination platforms. However, I'm determined to complete this process for my data, and to make an honest attempt at documenting things along the way - hopefully that will come in the form of some articles in the future, or maybe even the odd application or two.