The Joys Of Programming

PDFCreator will export from Word (and I assume other programs as well, though I've only used the Word functionality) as a PDF.

Bonus_Eruptus wrote:
RolandofGilead wrote:

Is there any program to let me write a .pdf the way one would use Word to write a .doc?

I'm pretty sure Word 2007 and up have PDF in the Save As options.

I think the export as PDF plugin isn't part of the default install, so if you can't see the option you can add it from the install discs.

I have used CutePDF Writer, a virtual printer, as my PDF creating solution.

Cyranix wrote:

I have used CutePDF Writer, a virtual printer, as my PDF creating solution.

I second that solution. It lets you create a PDF out of anything.

Any web gurus around? A tech request I responded to turned into volunteering to help out a small local organization with their website (rather botched by the people they paid to make it initially). I almost have it where they want it except for a few small formatting issues and that the contact form still isn't working; returns 'Error!' on clicking submit without actually submitting info to them.
*edit: I suspect this error is from the contact.php but I'm not familiar enough with php syntax to decipher why it's generating it.

I'm not a web guy and I didn't write the initial page so I'm a bit lost on fixing these last issues, any help would be appreciated. Feel free to pm.

The highest priority issue is getting the contact form working.
Contact page

Lower priority is minor formatting issues on the home page
1) The vertical gap between "Groups of Seniors" and "While Mother " is too large.
2) The vertical gap between "activities that Halifax has to offer!" and the horizontal line following it is too large.
3) The two horizontal lines at the bottom under "event or holiday" should not be there.
Every time I managed to fix any of these it would throw the formatting off on something else.

If anyone can help with these I'd be happy to send along a token amount of steam items or paypal.

krev82 wrote:

Any web gurus around? A tech request I responded to turned into volunteering to help out a small local organization with their website (rather botched by the people they paid to make it initially). I almost have it where they want it except for a few small formatting issues and that the contact form still isn't working; returns 'Error!' on clicking submit without actually submitting info to them.
*edit: I suspect this error is from the contact.php but I'm not familiar enough with php syntax to decipher why it's generating it.

Looking at the if block where it prints "Error!", it's probably an error with the actual mailing itself: most likely no mail server is set up on the machine, or the web server doesn't have access to the system's mail program. Try googling Linux command-line e-mailing and see if you can send an e-mail from that machine's command line. Then take a look at the documentation for PHP's mail() function (called as @mail in the script) to see what it actually does, and whether you can debug or trace it, for example by making it more verbose.

I should have mentioned, I want to create PDFs because Word sucks.

krev82 wrote:

*edit: I suspect this error is from the contact.php but I'm not familiar enough with php syntax to decipher why it's generating it.

The @ symbol in front of the mail function suppresses any error messages it might be generating. Remove that one character and then see if it gives you a useful error message. Also refer to the PHP doc on mail().

Edit: also change line 2 to error_reporting(E_ALL); so that warnings are reported as well as fatal errors.
Edit2: What in the world is line 1 there for? Does the domain fpert.qpoe.com mean anything to you?

Pro tip: visit php.net/X where X is the name of a built-in function to quickly look up the docs for it. (e.g. php.net/eregi)

Lower priority is minor formatting issues on the home page
1) The vertical gap between "Groups of Seniors" and "While Mother " is too large.
2) The vertical gap between "activities that Halifax has to offer!" and the horizontal line following it is too large.
3) The two horizontal lines at the bottom under "event or holiday" should not be there.
Every time I managed to fix any of these it would throw the formatting off on something else.

If anyone can help with these I'd be happy to send along a token amount of steam items or paypal.

Gah! in-line styles - It burnsss ussss.

OK, I fixed your 3 issues. Since I downloaded the page locally, my version will have scrambled the image paths, so you'll have to diff my version against yours and merge the changes.
html page:
http://pastebin.com/ftCJfkz5
new css:
http://pastebin.com/EGcLsa9T

1) Fixed this by collapsing all the text into a single paragraph and using 2 new divs to format the text. Rather clunkily, I've added the spacing with a pair of br tags, but you could/should inline a div that's 2em high instead.
2) The problem seemed to be weird placing of the h2 tags. I fixed that, removed the inline styles that the rulers were using, and moved them to their own CSS class called .line
3) I think those 2 lines were/are the continuation of the previous 2 rulers right from the top of this section above the list, which was occurring because the h2 and list tags were weirdly nested and there was a missing ul pair. I fixed that which got rid of 1 line and re-formatted the list correctly to get rid of the other.

DanB wrote:

fixes

Awesome! Thanks so much!

Thanks for the mail info, everyone else! I'll sift through it and godaddy's mail peculiarities.

krev82 wrote:

godaddy hosting

I'm so, so sorry.

krev82 wrote:
DanB wrote:

fixes

Awesome! Thanks so much!

No worries. To be honest, the html & css in that page is pretty cruddy. If I'd had the time and my hangover weren't quite so bad, I would have rebuilt the page from scratch; it's possibly worth doing that anyway if you have the time/desire.

momgamer wrote:

Yeah - this is why I mostly prototype in Photoshop.

Prototype in PowerPoint!

DanB wrote:
krev82 wrote:
DanB wrote:

fixes

Awesome! Thanks so much!

No worries. To be honest, the html & css in that page is pretty cruddy. If I'd had the time and my hangover weren't quite so bad, I would have rebuilt the page from scratch; it's possibly worth doing that anyway if you have the time/desire.

Also, on my uber monitor at hi resolution I notice the top banner background repeats rather... poorly.

Our app is being integrated with our new parent company's app. We launch their app from ours, but it was decided that for an upcoming tradeshow, their app's name would not be displayed, so as to not confuse. So they swapped in new images, updated some text, and changed the color scheme to more closely align with ours.

The images they sent us are terrible. Absolutely terrible. I knew they didn't really get good UI design, but this goes far beyond that. One is a JPG, the other a BMP. Both images have gradient backgrounds, in colors that don't match the scheme. Both have a white line of pixels running along the bottom for no reason. The JPG file is so horrendously overcompressed that it hurts to look at it. Also, the JPG file is native 275x90, but is being displayed at 288x134 so it is horribly stretched.

Edit: Ugh, I was only looking in one folder. There are 6 other terrible images in another folder. Some are super-low color over-dithered GIF files.

Quintin_Stone wrote:

Edit: Ugh, I was only looking in one folder. There are 6 other terrible images in another folder. Some are super-low color over-dithered GIF files.

It's always nice when further research leads to clarity. Unfortunately, sometimes what is made clear is that THIS IS BULLsh*t. Still, nice to know that for sure.

Also, saw this in the video thread.

Ugh.

Date is stored in the DB as a string (don't ask) mm/dd/yyyy format.

The SQL query from the server converts it to a datetime (with the "time" as 12:00 AM). SQL Server 2005 datetime types supposedly contain no time zone info. C# implicitly converts that to a .NET DateTime class with a Kind (timezone ref) of "unspecified".

Supposedly time zones are taken into account when serializing the DateTime to the client from the server. (This is a function call via .NET remoting, so that serialization is behind-the-scenes.) But for unspecified?

So the gist of the problem is that one user is seeing this date "4/3/1953" as "4/2/1953 11:00 PM" on the client app when we bind a cell to it in a DataGrid. But when we run the client on the same machine as the server, we see it as "4/3/1953 12:00 AM", which is what we expect.

Time zone or DST problem, right? You'd think so, except client and server machines are both in EST. So, like, WTF?
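For what it's worth, here is a minimal Python sketch (not the .NET code in question) of how the symptom can appear. It assumes, purely for illustration, that the two machines disagree by one hour about the UTC offset in effect on that date, for example because of different DST rules for 1953. The offsets and the function name are made up; this is not a diagnosis of the remoting behaviour.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets, for illustration only.
SERVER_OFFSET = timezone(timedelta(hours=-4))  # server assumes DST was in effect
CLIENT_OFFSET = timezone(timedelta(hours=-5))  # client assumes it was not


def as_seen_by_client(date_string: str) -> str:
    """Parse an mm/dd/yyyy string as server-local midnight, then view it on the client."""
    naive = datetime.strptime(date_string, "%m/%d/%Y")  # midnight, no zone info
    on_server = naive.replace(tzinfo=SERVER_OFFSET)      # interpret as server local time
    on_client = on_server.astimezone(CLIENT_OFFSET)      # convert to client local time
    return on_client.strftime("%m/%d/%Y %I:%M %p")


print(as_seen_by_client("4/3/1953"))  # 04/02/1953 11:00 PM
```

A date-only value stored at midnight is the worst case: any one-hour shift backwards lands on the previous day.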

QStone - You might have already accounted for this, but is the one affected user on a box or dependent on a box that doesn't have the post-2007 DST window patched in? It sounds like you're already past that stage but it's something to double check.

From what I remember, client and server both ran Windows 7.

I've changed the code so that the displayed info is just the raw date string from the DB. The reason it was being converted was entirely for sorting reasons. But I can sort on a converted datetime without having it as a displayed column.
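The idea of sorting on a converted value while displaying the raw string can be sketched roughly like this in Python (the real app is C# with a DataGrid, so this is only an illustration; the function name and sample rows are made up):

```python
from datetime import datetime


def sort_by_date(raw_dates):
    """Sort mm/dd/yyyy strings chronologically without changing the strings."""
    # The parsed datetime is used only as the sort key; it is never displayed.
    return sorted(raw_dates, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))


rows = ["12/25/1950", "4/3/1953", "1/15/1952"]
print(sort_by_date(rows))  # ['12/25/1950', '1/15/1952', '4/3/1953']
```

Because the displayed value is never a date object, no time zone conversion can touch it.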

Why can't you write a program that adds a field as a time stamp and fix all the data?


If it was just one column, it'd be easy. But the columns are all dynamically created according to customer needs, and the whole system was written back in the day to store it all as text. And each customer may have dozens of date or datetime fields in their system.

I think I'm going with Nitro Pro 8 for pdf.
Why is this relevant? Look at the link for the OCR features, there are a lot of old computer science papers that are just scanned images.

I know I read an article somewhere about why pdf and .doc are the way they are, but I cannot imagine or have forgotten why it makes sense to have a .doc (or an rtf for that matter) to print out with one page on one machine and two pages on another nor can I forgive it. f*ck that sh*t.

WordPerfect 5.1 forever! Keyboard commands rule!

edit: Wow, it's $120, but MS Word 2013 is $110. I would totally pay $10 to never have to use Word again.

RolandofGilead wrote:

I know I read an article somewhere about why pdf and .doc are the way they are, but I cannot imagine or have forgotten why it makes sense to have a .doc (or an rtf for that matter) to print out with one page on one machine and two pages on another nor can I forgive it. f*ck that sh*t.

The main culprit here is fonts. Inside a Word document, each block of text - which can be anything from a paragraph up to the whole document - is tagged with a font name, and the layout engine handles line breaks and page breaks based on how much space the text takes up. The problems start when the printer font doesn't exactly match the one on your PC. If a typical letter in Microsoft's implementation of Times Roman takes up, say, 10x20 pixels, while the printer uses a slightly different implementation of Times Roman where it takes up 11x21 pixels, the lines are going to be broken up differently (because each word is taking up slightly more space), and if this adds a bit more height to enough paragraphs, the text spills over onto a new page.
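A toy Python sketch of that effect, assuming a simple greedy word-wrap and a fixed width per character (real layout engines are far more involved, and the widths and line length here are made-up numbers):

```python
def count_lines(text, char_width, line_width):
    """Greedy word wrap: count the lines needed when every character is char_width wide."""
    lines = 1
    used = 0  # horizontal space consumed on the current line
    for word in text.split():
        word_width = len(word) * char_width
        space = char_width if used else 0  # a space is needed only between words
        if used and used + space + word_width > line_width:
            lines += 1          # word does not fit, start a new line
            used = word_width
        else:
            used += space + word_width
    return lines


text = " ".join(["word"] * 10)
print(count_lines(text, 10, 500))  # 1 line with the 10-unit font
print(count_lines(text, 11, 500))  # 2 lines with the 11-unit font
```

The same text and the same "page" give a different line count once each character is one unit wider, which is the same mechanism that pushes a paragraph onto a second page.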

This doesn't happen (normally) with PDF because PDF is designed to specify the exact layout of a page, which Word really doesn't do. Inside a PDF document, instead of just tagging a big block of text with a font and letting the display engine do the layout, the text is broken up into much shorter blocks (typically a few words, but often as small as individual characters), and the exact position of each block on the page is specified. You can still get the situation where the printer uses a slightly different font (it's rarer though, because PDF generators embed fonts in documents more often than Word does), but because a PDF doesn't re-flow when the font changes, the worst that can happen is a slightly ragged right margin.

(Technically, PDF can do automatic re-flowing like Word, but documents that use this feature are unusual; PDF generating software normally does the lower level detailed layout I described above.)

3) I know I said print, but it actually prints out exactly as it shows up on the screen. What I don't get is how two copies of the same version of Word on two different computers can turn out to be one page on one and two on another.

CaptainCrowbar wrote:
RolandofGilead wrote:

I know I read an article somewhere about why pdf and .doc are the way they are, but I cannot imagine or have forgotten why it makes sense to have a .doc (or an rtf for that matter) to print out with one page on one machine and two pages on another nor can I forgive it. f*ck that sh*t.

The main culprit here is fonts. Inside a Word document, each block of text - which can be anything from a paragraph up to the whole document - is tagged with a font name, and the layout engine handles line breaks and page breaks based on how much space the text takes up. The problems start when the printer font doesn't exactly match the one on your PC. If a typical letter in Microsoft's implementation of Times Roman takes up, say, 10x20 pixels, while the printer uses a slightly different implementation of Times Roman where it takes up 11x21 pixels, the lines are going to be broken up differently (because each word is taking up slightly more space), and if this adds a bit more height to enough paragraphs, the text spills over onto a new page.

Holy crap that is messed up, 1) how can they both be the same font but be different?
2) We still print character-by-character? I just assumed that the computer computed each individual dot that it wanted to print these days. How else could it do images within something like a .doc?

What I don't get is how two copies of the same version of Word on two different computers can turn out to be one page on one and two on another.

Could be something related to margins. If the default paper size isn't exactly identical (for instance, if the two Word installations don't have exactly the same printer) then you might end up with something like that.

Holy crap that is messed up, 1) how can they both be the same font but be different?
2) We still print character-by-character? I just assumed that the computer computed each individual dot that it wanted to print these days. How else could it do images within something like a .doc?

Printers use page description languages, either PCL or Postscript, usually. The print driver sets the font, possibly downloading a bitmap representation of it, if the printer doesn't have a native version, and then sends text as text. Both page languages also have the ability to accept an arbitrary bitmap at an arbitrary page position, which allows pictures to be printed along with text.

Some print engines, like Ghostscript on Linux, use this mode to pre-render all the fonts internally, generating a picture of the page, which is shaped to fit exactly in the allowable paper margins, and then printing that picture. But this is hundreds of times larger than sending text plus a font description, and on most printers, for a long time, this was much slower than having the printer print text. It probably still is, on an inkjet; a good laser printer would probably be about the same speed either way.

RolandofGilead wrote:

3) I know I said print, but it actually prints out exactly as it shows up on the screen. What I don't get is how two copies of the same version of Word on two different computers can turn out to be one page on one and two on another.

Same version of Word doesn't necessarily mean the same set of fonts, unfortunately. Installing other software - such as printer drivers - will sometimes overwrite existing fonts with a slightly different version of the same font from the new installer. Of course software vendors should be careful not to replace anything that might break existing installations, but vendors do a lot of things they're not supposed to do. (How often have you had old software break when new software replaces a DLL with a new version that's supposed to be backward compatible but isn't quite?)

RolandofGilead wrote:

Holy crap that is messed up, 1) how can they both be the same font but be different?

Widely popular fonts - the likes of Times Roman, Courier, Helvetica, etc - are old designs that mostly predate digital typography. Many of them have been digitised by multiple font vendors; some of the classic fonts are in the public domain now, others have been licensed by multiple vendors. The result is that lots of fonts, especially the well known, widely used ones, exist in multiple versions that are almost but not quite identical. Microsoft licenses Vendor A's implementation of Times Roman for Office, HP licenses Vendor B's Times Roman for its Deskjet drivers.

RolandofGilead wrote:

2) We still print character-by-character? I just assumed that the computer computed each individual dot that it wanted to print these days. How else could it do images within something like a .doc?

All document formats have some way of embedding images as well as text, usually allowing both bitmap and vector images. There was a period (around the late 1990s) when doing all the rendering on the client PC and sending the resulting bitmap to the printer was popular, and a few printers still work that way. But these days printers have respectable CPUs of their own, and the main bottleneck in the printing process is the time it takes to send the document across the network to the printer. So it's better to send the PDF (or Postscript or PCL or whatever format the printer speaks), and let the printer do the rendering, because the PDF/PS/etc is much smaller than the same document would be as a bitmap image.

And as for why PDF files lay things out character by character even for screen stuff: to get really good typography, you sometimes have to do a lot of work to adjust character spacing. Certain letters kern together differently than other letters, so you might have rules about how two "t"s next to each other is different from "ti" and such in the font. Those automatic rules work reasonably well, but to do really well requires a fair amount of effort, so publishing programs that care about how things look might go through extra effort to do it well. It's not super important for body text, but for headlines and logotypes, where the text is short and presented very large, it can make a huge difference. So, there's still a lot of room for adjusting the spacing of individual letters in words to get the best effect. (Consider also that there are different layout rules for different contexts--if you're laying out mathematical equations, you need to place things differently from laying out running text. If you do that layout up front, then the viewer application doesn't need to understand the special rules. If you do it in the viewer, then every viewer has to know every possible set of rules.)

At the same time, for efficiency's sake, you don't want to encode things as a drawing--the letter "a" mostly looks the same every time, so why not make use of that? So you send information about how to draw a letter "a" and then later say "put an 'a' here". Hence: fonts. And, partially because font makers are protective of their stuff, fonts can't always be embedded in the output, so you have extra work to do there.

--

Unrelated: I came across a blog post today, Why functional code is shorter, with some interesting thoughts about how the expressive power visible in functional languages is mostly about how they make it easy to compose larger programs from smaller pieces, and that this property doesn't require a purely functional language--it just requires some discipline about maintaining reasonable referential transparency. (i.e. even if you do things involving state, that's fine--as long as the consumer of your API is insulated from the state manipulations.)
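A small Python sketch of that point, as I understand it (this is my own illustration, not code from the blog post; compose, memoize, and slow_square are made-up names). The memoized function mutates a cache internally, but callers only ever see the same output for the same input, so it still composes like a pure function:

```python
from functools import reduce


def compose(*funcs):
    """Return a function that applies funcs left to right."""
    return lambda value: reduce(lambda acc, f: f(acc), funcs, value)


def memoize(func):
    """Cache results; the cache is internal state the caller never observes."""
    cache = {}

    def wrapper(arg):
        if arg not in cache:
            cache[arg] = func(arg)
        return cache[arg]

    return wrapper


@memoize
def slow_square(n):
    return n * n


# Build a larger function out of small ones: strip, measure, square.
pipeline = compose(str.strip, len, slow_square)
print(pipeline("  abc "))  # 9
```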

Malor wrote:
What I don't get is how two copies of the same version of Word on two different computers can turn out to be one page on one and two on another.

Could be something related to margins. If the default paper size isn't exactly identical (for instance, if the two Word installations don't have exactly the same printer) then you might end up with something like that.

Holy crap that is messed up, 1) how can they both be the same font but be different?
2) We still print character-by-character? I just assumed that the computer computed each individual dot that it wanted to print these days. How else could it do images within something like a .doc?

Printers use page description languages, either PCL or Postscript, usually. The print driver sets the font, possibly downloading a bitmap representation of it, if the printer doesn't have a native version, and then sends text as text. Both page languages also have the ability to accept an arbitrary bitmap at an arbitrary page position, which allows pictures to be printed along with text.

Some print engines, like Ghostscript on Linux, use this mode to pre-render all the fonts internally, generating a picture of the page, which is shaped to fit exactly in the allowable paper margins, and then printing that picture. But this is hundreds of times larger than sending text plus a font description, and on most printers, for a long time, this was much slower than having the printer print text. It probably still is, on an inkjet; a good laser printer would probably be about the same speed either way.

Since PCL is a big part of my job, I'll shine some more light on how Word does PCL printing. For TrueType fonts, it embeds a description of how to draw each character ("glyph") as a series of shapes and kerning, rather than a bitmap. This way, it can scale smoothly to any size. It embeds the font info because there are maddening minute differences between a font in Windows and the font by the same name that might be installed on the printer. Word analyzes sections of the document as it creates the print job, embedding glyph descriptions only for characters that are being used.

Having told the printer how to draw each character, it can send whole blocks of text and only specify where to draw the first character.

An interesting thing I discovered was how WordPad prints a Word or RTF document to PCL. It does not embed font info. Instead, it uses the printer's font and breaks apart text into small clusters of about 1 to 5 characters and gives hard X,Y coords for each cluster. By doing this it overcomes the biggest problem with font differences, which is the character width and kerning discrepancy that is cumulative as you print glyph after glyph.
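To put rough numbers on the cumulative discrepancy, here is a toy Python sketch. It assumes fixed-width glyphs and made-up widths (10 units intended, 11 on the printer), and is not actual PCL: each cluster is re-anchored at its intended X position, so the error can only build up within a cluster.

```python
def max_drift(text, intended_width, printer_width, cluster_size):
    """Largest horizontal error when each cluster starts at its intended X position."""
    worst = 0
    for start in range(0, len(text), cluster_size):
        cluster = text[start:start + cluster_size]
        anchor = start * intended_width  # hard X coordinate taken from the layout
        for i in range(len(cluster)):
            intended_x = (start + i) * intended_width
            actual_x = anchor + i * printer_width  # printer advances by its own width
            worst = max(worst, abs(actual_x - intended_x))
    return worst


line = "x" * 60
print(max_drift(line, 10, 11, 60))  # whole line as one block: drift reaches 59 units
print(max_drift(line, 10, 11, 5))   # 5-character clusters: drift never exceeds 4 units
```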

If the same document is not printing the same from 2 different computers, I'd look at the driver settings for slight differences. Is the doc being printed from the 2 machines to the same printer? Hold the two print-outs overlapping to a bright light. How closely do the characters line up? Do the margins match? Do paragraphs break on the same words?

Gah... I have to upgrade an extensive Rails 2.2.x application to Rails 3.2.x

So far it looks like the easiest method will just be to start a whole new Rails 3 application and move the code over piece by piece.