Author: Stefan Pettersson
How to detect if an ASCII textfile uses UNIX or Windows linebreaks?
This function detects whether a textfile is a Windows or a UNIX textfile. While
we're at it, let's show two versions of the same function, one beautiful and one
less beautiful.
Answer:
First of all, some background: Windows uses CRLF ($0D $0A or #13 #10) as the
linebreak in textfiles, while UNIX/Linux uses just LF ($0A or #10).
Detection is needed because the Readln procedure will not work on UNIX files: it
cannot detect the linebreak. Instead of seeing your application go crazy, it is
nicer to detect in advance whether it's a UNIX file or not, and then provide the
option to convert it if necessary.
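The conversion step itself is not covered by this tip. As a rough sketch only, it could be done by copying the file and inserting a CR in front of every LF that does not already have one. The procedure name ConvertUnixToWindows, the buffer sizes, and the use of TFileStream are assumptions made for this illustration; the source and target must be two different files.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses
  Classes, SysUtils;

// Hypothetical helper, not part of the original tip: copies SourceName to
// TargetName and turns every bare LF into CRLF. Existing CRLF pairs are kept.
procedure ConvertUnixToWindows(const SourceName, TargetName: string);
var
  Source, Target: TFileStream;
  InBuf: array[0..4095] of Byte;
  OutBuf: array[0..8191] of Byte; // worst case: every input byte is a bare LF
  BytesRead, OutLen, I: Integer;
  PrevB: Byte;
begin
  PrevB := 0;
  Source := TFileStream.Create(SourceName, fmOpenRead or fmShareDenyWrite);
  try
    Target := TFileStream.Create(TargetName, fmCreate);
    try
      repeat
        BytesRead := Source.Read(InBuf, SizeOf(InBuf));
        OutLen := 0;
        for I := 0 to BytesRead - 1 do
        begin
          // insert the missing CR in front of a bare LF
          if (InBuf[I] = $0A) and (PrevB <> $0D) then
          begin
            OutBuf[OutLen] := $0D;
            Inc(OutLen);
          end;
          OutBuf[OutLen] := InBuf[I];
          Inc(OutLen);
          PrevB := InBuf[I]; // remembered across block boundaries
        end;
        if OutLen > 0 then
          Target.WriteBuffer(OutBuf, OutLen);
      until BytesRead = 0;
    finally
      Target.Free;
    end;
  finally
    Source.Free;
  end;
end;
```

A call such as ConvertUnixToWindows('unix.txt', 'windows.txt') would then write a copy with CRLF linebreaks; both file names here are placeholders.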
The way to detect if it's a UNIX or Windows file is to spot the difference, i.e. to
see if a CR char precedes the LF char.
Here is a go at it:
function IsFileUNIX(Filename: string): Boolean;
var
  StopRead: Boolean;
  F: file of Byte;
  CurB, PrevB: Byte;
begin
  StopRead := False;
  PrevB := 0;
  Result := True; // a file without any LF is reported as UNIX

  AssignFile(F, Filename);
  FileMode := 0; // read only
  Reset(F);
  try
    while (not Eof(F)) and (not StopRead) do
    begin
      Read(F, CurB);

      // the first LF decides: check if $0D precedes $0A
      if CurB = $0A then
      begin
        Result := PrevB <> $0D;
        StopRead := True;
      end;

      PrevB := CurB;
    end;
  finally
    CloseFile(F); // always release the file handle
  end;
end;
Well, this function did what I wanted. However, I thought it looked kind of ugly,
so I began to think about how I could use the same principle but execute it with
fewer statements and make the function a little more beautiful.
Simply replacing the while loop with a repeat loop worked wonders. Here's the
second go at it:
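function IsFileUNIX2(Filename: string): Boolean;
var
  F: file of Byte;
  CurB, PrevB: Byte;
begin
  CurB := 0; // defined start values, so PrevB never holds garbage
  PrevB := 0;

  AssignFile(F, Filename);
  FileMode := 0; // read only
  Reset(F);
  try
    if not Eof(F) then // an empty file has nothing to read
      repeat
        PrevB := CurB;
        Read(F, CurB);
      until (CurB = $0A) or Eof(F);
  finally
    CloseFile(F); // always release the file handle
  end;

  // Windows only if an LF was found and $0D precedes it
  Result := not ((CurB = $0A) and (PrevB = $0D));
end;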
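Both versions above call Read once per byte, which may become slow if the first linebreak comes late in a large file. As a further sketch, not part of the original tip, the same check can be written with a TFileStream that reads the file in blocks. The function name IsFileUNIXStream and the 4 KB buffer size are assumptions chosen for this example.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
uses
  Classes, SysUtils;

// Hypothetical variant, not part of the original tip: same rule as above
// (the first LF decides), but the file is read in blocks through a stream.
function IsFileUNIXStream(const Filename: string): Boolean;
var
  Stream: TFileStream;
  Buffer: array[0..4095] of Byte;
  BytesRead, I: Integer;
  PrevB: Byte;
begin
  Result := True; // same default as above: no LF found means "UNIX"
  PrevB := 0;
  Stream := TFileStream.Create(Filename, fmOpenRead or fmShareDenyWrite);
  try
    repeat
      BytesRead := Stream.Read(Buffer, SizeOf(Buffer));
      for I := 0 to BytesRead - 1 do
      begin
        if Buffer[I] = $0A then
        begin
          // check if $0D precedes $0A
          Result := PrevB <> $0D;
          Exit; // the finally block below still frees the stream
        end;
        PrevB := Buffer[I]; // remembered across block boundaries
      end;
    until BytesRead = 0;
  finally
    Stream.Free;
  end;
end;
```

It is called the same way as the two functions above, for example: if IsFileUNIXStream(Filename) then, followed by whatever conversion or warning the application offers.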