Oracle search inside blob content

Here is a query to search for a string inside a blob

SELECT *
FROM my_schema.my_table t
WHERE dbms_lob.instr(t.my_blob,utl_raw.cast_to_raw('Searched string'))>0

8 comments to Oracle search inside blob content

  • Cris

    Hello
    I’m searching for possibilities to search inside blob columns so I found your post. So I tried this query. I inserted in my table a row having in the blob column a Microsoft Word document (.doc) having inside the string “este”, among other strings. Unfortunately when I try this query it doesn’t return this row. Do you have any idea why is this happening? Do you have any suggestions?
    Thanks

  • archana

    hello,
    code worked with word file. but pdf file, it worked only for few words.
    Could you please update me, how to search a paticular string within pdf file blob?
    Regards,
    archana

  • Works fine.

    As for Word and PDF files… well, Word may be in two-byte encoding (Unicode, UCS-2), and the query assumes the blob contains text in a single-byte one (e.g. ASCII).

    PDF, on the other hand, as a Postscript variant, may (AFAIK) split whole words into parts according to their formatting and position on the page. So the search may not work too.

    In short, not every application format can be searched as text. Try to run grep over your file — if it finds the text, so this query should

  • Hi,

    As vladimir said, it does not work with “encoded” files.

    Here is how my Panacotta receipt looks like in a word ’97 format:

          :      Verser un petit peu de coulis
    de fruits rougehEAƒ hEAƒ 5B*CJ PJ  \nH
    

    So sometimes you see text, but most of the time, the text is mixed with binary delimitors and complex word specific tags.
    If I convert it in pdf… here is what I got:

    stream
    xœ½]ÝŽ·‘¾ wè›ÎI2­&Ù¿€!À–ìÀ»Ö”Ý{/&£#YÀh$ÍŒ²Ù·
    Šßh/=ûË"‹l’Í*v÷‰G3‡d±XU$ëã×ÍS=û¾úâ‹gzñí˪yþ¼úêå‹
    êÓÓ'Jµ«FÿO´C=ô•šz¨D#릯®Þ?}òìÛ÷—oO]õòCõÃÓ'Õ×zQU
    0Â¾z¥«~#*¡êIV¯Þ< }"¬ØJe[
    J—T¯´Ä¦–BVM­¥výTݾÍ}úç?>}òãáûãE¸¼¹¹¬^|¸?^q¸¿< þ
    Wõê_Ÿ>ùZ÷	ýºž`“zúñPQU;Q7*ªªÈªS-¦uR{Y«XÇ‹îpy{¼
    P‡7ŸÃá=ŒçŽ E¯Eöµ°¾Œ¬/+‘7=HíÑöZ?m^í\íçV é“gË·‡z òðÙ
    ü~k~¯ôÏé0êÁüý£ùü¤ëH¨3H=À‹nÀ¶7¦­ýiëPÃýXS¨oaü*¿Ò㯧>g‚
    ~œêA 	ºFVoÝ/f¨ÿqT‡KðÑõIt„Aô‡J«< Ìß÷G!í‡ï޽þKû~ù k˜‘
    ‡Óõõ‰ÕÔÖR…*FÕ⨴êÝ?'=Eë©íŒ·z=O=¢¡­GUu²îƶÒ!*«ÛÓÓ
    'o~‡?•jþçïª]7è·›†5M­Ñu¥¬Áû	¢Ê\H˜Ùr„øjÝtO>´1Úåóÿ
    …uzÔRHí@0«¬åPÉI/Xm%ê~ôvp5ì TÝWCݶsqÛ¿¸b•”;3
    

    worst…

    Just try to open any PDF file with NOTEPAD and you will see how Oracle sees it.

  • Sourabh

    Hi,

    To search inside a BLOB column use the following query.
    It works fine. It is for Oracle.
    I have no idea if it works with other DB.

    SELECT column_name1 FROM table_name
    WHERE dbms_lob.INSTR(column_name2, utl_raw.cast_to_raw(‘pattern’),1,1) > 0

    In pattern, specify only the string u need to search
    dont specify % or any other special characters.

    Hope this helps you or whoever reads this.

  • Satheesh

    Awesome. It worked for me. Thanks for your help.

  • Mithun

    Hey, its working with .xml,.txt,.doc,.sql but with .pdf its not working
    anyone have solution for it n why with pdf its not working..
    plz reply asap if anyone have idea abt it…
    thnx……

  • Joanne Luck

    This was just what I needed to identify png image blobs.
    Thank you
    :o)

Leave a Reply

 

 

 

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

*