Ediscovery: navigating through the digital maze


By Richard Szabo | Friday, 23 January 2009


Latest comments


Concerns on MD5 Jered Floyd | 24/01/2009
[Apologies for the duplicates; newlines seem to be causing trouble with the submit button for me.]

MD5 is a fine algorithm to use to identify candidate matches for duplicate emails, but today should not be considered sufficient -- a full content compare should be done, or a stronger hash (such as SHA256) should be used. Does Allens do this?
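The two-step approach described above (MD5 to nominate candidates, then a full content compare) can be sketched as follows. This is a minimal illustration, not a description of any firm's actual process: the function names `find_duplicates` and `sha256_key` are hypothetical, and emails are assumed to be available as raw byte strings.

```python
import hashlib
from collections import defaultdict


def find_duplicates(messages):
    """Return groups of indexes whose contents are byte-for-byte identical.

    MD5 is used only to nominate candidate matches; every candidate is then
    confirmed by a full content comparison, so a hash collision alone can
    never cause two different messages to be treated as duplicates.
    """
    # Step 1: bucket messages by MD5 digest (cheap candidate selection).
    candidates = defaultdict(list)
    for idx, body in enumerate(messages):
        candidates[hashlib.md5(body).hexdigest()].append(idx)

    # Step 2: inside each bucket, split by actual content.
    groups = []
    for indexes in candidates.values():
        confirmed = []
        for idx in indexes:
            for group in confirmed:
                if messages[group[0]] == messages[idx]:
                    group.append(idx)
                    break
            else:
                confirmed.append([idx])
        groups.extend(g for g in confirmed if len(g) > 1)
    return groups


def sha256_key(body):
    """Alternative deduplication key using a stronger hash (SHA-256)."""
    return hashlib.sha256(body).hexdigest()
```

For example, `find_duplicates([b"a", b"b", b"a"])` reports that the messages at indexes 0 and 2 are duplicates. Swapping `hashlib.md5` for `hashlib.sha256` in step 1 is the other option mentioned above; the content comparison in step 2 can stay either way.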

MD5 is a 128-bit hash, which is on the small side for very large quantities of data in terms of the risk of hash collision (two files being identified as the same while really being different). However, a larger concern is that MD5 has been programmatically broken: a malicious person can relatively easily create two documents that intentionally have the same hash value, which is a serious problem for deduplication.
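To put the accidental-collision risk in rough numbers, the standard birthday-bound approximation can be used. The sketch below assumes an ideal hash with uniformly distributed outputs, and the function name `collision_probability` is illustrative. It says nothing about deliberately constructed collisions, which are the larger concern raised above.

```python
import math


def collision_probability(num_items, hash_bits):
    """Approximate chance that any two of num_items distinct inputs collide.

    Uses the birthday-bound approximation
        p ~ 1 - exp(-n * (n - 1) / 2 ** (bits + 1))
    which assumes hash outputs are uniformly random.
    """
    exponent = -num_items * (num_items - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


# Compare a 128-bit hash (MD5-sized) with a 256-bit hash (SHA-256-sized)
# for a hypothetical collection of one billion documents.
p_128 = collision_probability(10**9, 128)
p_256 = collision_probability(10**9, 256)
print(f"128-bit: {p_128:.3e}  256-bit: {p_256:.3e}")
```

Under these assumptions the accidental risk is tiny at either size for a collection like this, which is why the deliberate-collision attack matters more than the raw bit length.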

The SHA2 family of hashes does not have known attacks against it, and the risk of collision is extraordinarily small. I've written more about this topic on my blog, in the article at http://permabit.wordpress.com/2008/07/18/what-do-hash-collisions-really-mean/

Regards,
Jered Floyd
CTO, Permabit Technology Corp.
MD5 - I bet you! PM | 24/01/2009
Jered - Using SHA is a good idea; you are correct. However, could YOU please make me an MD5 collision for the content of this HTML page to demonstrate how it is done? I would love to see the two files with the same MD5.
MD5 - I bet you! Part 2 MD5 | 24/01/2009
Beating up on MD5 is a cheap shot, imho. It serves no purpose but to confuse and bamboozle people. In my experience, 99.9999% of the time it is the argument of people who could not make an MD5 collision. Let’s say we have two files. One is an image of illegal content. Let’s say we get an MD5 collision and find a file on the suspect’s computer with the same MD5.

What is the chance that no one is going to look at the content of the two files before the hammer falls?

Really, guy - the chance of a collision on the smoking-gun doc is so close to ZERO as to be absurd.

People, please use SHA, but don’t be fooled by the "MD5 is broken, run to the hills" argument above.

I will take all this back if you post the file YOU created with the same MD5 as this HTML page. I will watch your blog for it. How long do you estimate before I should check back in?
"The attack against MD5 just shows it's possible to generate some collisions when you can control both inputs and the hash. But does not allow you to generate a colliding input for a hash you're given, even if you know the input which produced the hash.
http://www.infosec.sdu.edu.cn/paper/md5-attack.pdf"
http://episteme.arstechnica.com/eve/forums/a/tpc/f/6330927813/m/543005743831
What lawyers really need to know Derek Begg | 27/01/2009
In reality, lawyers don't have time to get up to speed on many e-discovery concepts, e.g. the MD5 hash debate here.
What lawyers really need to know - cont. Derek Begg | 27/01/2009
What's important to lawyers is understanding how e-discovery can help them do what they are doing right now - analyse and classify documents, and advise their clients on the issues arising. This article has lots of helpful concepts, but lawyers reading this will still need the answers to their burning issues:
- what do I need to know and what can I leave to the experts?
- what is this all going to cost, and will this threaten my profits?
- how do I handle this if I don't have the time or the budget to work it out for myself?
Focus should be on the best outcome for the Client Lee Trevena | 25/03/2009
Whether you are a technology provider or legal counsel, the objective of eDiscovery regulations in the US, UK and Australia is better Client outcomes; saving money and achieving justice would probably stand out at the top.

One of the sources of significant risk for Clients, and their partners, is not capturing all of the evidence (e.g. email transmissions). This weakens the legal position even if that evidence was a positive for the client. If this is attempted post-filing, the risks and costs go up.

The other related issue is that even if you have recovered all of the evidence, without an initial client-side 'high level' filter the processing of email evidence will take too long and cost too much. The lawyers should simply be provided the relevant email, not the entire email backup for a company, which won't have records from 2003 anyway.

Also, I would expect your clients just want a technology that works when they need it and doesn't cost tens of thousands to maintain each year. The era of Technology as a cost centre is over.

Finally, in the eDiscovery age even the need to transport ESI is reducing. If counsel needs to review the email for a case, simply provide them a secure login and they can discover from anywhere, as the Net has no boundaries.

Better technology is making things better for lawyers and their clients.
