#1
Bill Crowl
 
How do I build a word list with number of occurrences of each wor.

I need to build a technical term list with definitions from a large Word
document. I figure if I could simply list all of the words in the document
and the number of times the word is used, I could easily cut out the common
words and concentrate on the tech terms. Does anyone know a way to do this
in Word?

Thanks,
Bill Crowl
#2
Greg Maxey
 

Bill,

This macro does a nice job of creating a word frequency list:

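Sub WordFrequency()

    Dim SingleWord As String        'Raw word pulled from doc
    Const maxwords = 9000           'Maximum unique words allowed
    Dim Words(maxwords) As String   'Array to hold unique words
    Dim Freq(maxwords) As Long      'Frequency counter for unique words
    Dim WordNum As Integer          'Number of unique words
    Dim ByFreq As Boolean           'Flag for sorting order
    Dim ttlwds As Long              'Total words in the document
    Dim Excludes As String          'Words to be excluded
    Dim Found As Boolean            'Temporary flag
    'Each variable needs its own type; "Dim j, k, l, Temp As Integer"
    'would leave j, k and l as Variants
    Dim j As Long, k As Long, l As Long, Temp As Long
    Dim IngWordCount As Long        'Total non-excluded words in document
    Dim NonWordObjects As Long      'Items that are not words (punctuation, numbers)
    Dim AllWordObjects As Long      'Everything in ActiveDocument.Words
    Dim TotalWords As Long          'AllWordObjects minus NonWordObjects
    Dim tword As String             'Holds a word while two entries are swapped
    Dim Ans As String               'Reply to the sort order prompt
    Dim aword As Range              'Current item of ActiveDocument.Words
    Dim tmpName As String           'Full name of the attached template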

    'Set up excluded words
    'Excludes = "[pickleloaf][gruntbutter]"
    'Excludes = Excludes & InputBox$("The following words are excluded by " & _
    '    "default: " & Excludes & ". Enter additional words that you wish to " & _
    '    "exclude, surrounding each word with [ ].", "Excluded Words", "")
    'Words are compared in lower case, so lower-case the exclude list too
    Excludes = LCase$(InputBox$("Enter words that you wish to exclude. " _
        & "Place each word within square brackets [ ]. " _
        & "Example: [is][a].", "Excluded Words", ""))

    'Find out how to sort
    ByFreq = True
    Ans = InputBox$("Default sort order is word frequency. To sort " _
        & "alphabetically by word, type Word in the field below.", _
        "Sort order", "FREQ")
    If Ans = "" Then Exit Sub       'Cancelled or left empty
    If UCase(Ans) = "WORD" Then
        ByFreq = False
    End If

    Selection.HomeKey Unit:=wdStory
    System.Cursor = wdCursorWait
    WordNum = 0
    ttlwds = ActiveDocument.Words.Count
    'AllWordObjects = ActiveDocument.Words.Count
    'TotalWords = NonWordObjects

    'Control the repeat
    For Each aword In ActiveDocument.Words
        SingleWord = Trim(LCase(aword.Text))
        'Out of range? Only the first character is tested, because a whole-string
        'test (SingleWord > "z") would also throw away words such as "zebra".
        If Left$(SingleWord, 1) < "a" Or Left$(SingleWord, 1) > "z" Then
            SingleWord = ""
            NonWordObjects = NonWordObjects + 1
        End If
        'SingleWord = Trim(aword)
        'If SingleWord < "A" Or SingleWord > "z" Then SingleWord = "" 'Out of range?
        'On exclude list?
        If InStr(Excludes, "[" & SingleWord & "]") > 0 Then SingleWord = ""
        If Len(SingleWord) > 0 Then
            IngWordCount = IngWordCount + 1
            Found = False
            For j = 1 To WordNum
                If Words(j) = SingleWord Then
                    Freq(j) = Freq(j) + 1
                    Found = True
                    Exit For
                End If
            Next j
            If Not Found Then
                WordNum = WordNum + 1
                Words(WordNum) = SingleWord
                Freq(WordNum) = 1
            End If
            If WordNum > maxwords - 1 Then
                j = MsgBox("The maximum array size has been exceeded. " _
                    & "Increase maxwords.", vbOKOnly)
                Exit For
            End If
        End If
        ttlwds = ttlwds - 1
        StatusBar = "Remaining: " & ttlwds & " Unique: " & WordNum
    Next aword

    'Now sort it: alphabetically by word, or by descending frequency
    For j = 1 To WordNum - 1
        k = j
        For l = j + 1 To WordNum
            If (Not ByFreq And Words(l) < Words(k)) _
                Or (ByFreq And Freq(l) > Freq(k)) Then k = l
        Next l
        If k <> j Then
            tword = Words(j)
            Words(j) = Words(k)
            Words(k) = tword
            Temp = Freq(j)
            Freq(j) = Freq(k)
            Freq(k) = Temp
        End If
        StatusBar = "Sorting: " & WordNum - j
    Next j

    AllWordObjects = ActiveDocument.Words.Count
    TotalWords = AllWordObjects - NonWordObjects

    'Now write out the results
    tmpName = ActiveDocument.AttachedTemplate.FullName
    Documents.Add Template:=tmpName, NewTemplate:=False
    Selection.ParagraphFormat.TabStops.ClearAll
    With Selection
        For j = 1 To WordNum
            .TypeText Text:=Words(j) & vbTab & Trim(Str(Freq(j))) & vbCrLf
        Next j
    End With
    ActiveDocument.Range.Select
    Selection.ConvertToTable
    Selection.Collapse wdCollapseStart
    ActiveDocument.Tables(1).Rows.Add BeforeRow:=Selection.Rows(1)
    ActiveDocument.Tables(1).Cell(1, 1).Range.InsertBefore "Unique Words"
    ActiveDocument.Tables(1).Cell(1, 2).Range.InsertBefore _
        "Number of Occurrences"
    ActiveDocument.Tables(1).Columns(2).Select
    Selection.ParagraphFormat.Alignment = wdAlignParagraphRight
    Selection.Collapse wdCollapseStart
    ActiveDocument.Tables(1).Rows(1).Shading.BackgroundPatternColor = _
        wdColorGray20
    ActiveDocument.Tables(1).Columns(1).PreferredWidth = InchesToPoints(4.75)
    ActiveDocument.Tables(1).Columns(2).PreferredWidth = InchesToPoints(1.9)

    'Summary heading row
    ActiveDocument.Tables(1).Rows.Add
    ActiveDocument.Tables(1).Cell(ActiveDocument.Tables(1).Rows.Count, _
        1).Range.InsertBefore "Summary"
    ActiveDocument.Tables(1).Cell(ActiveDocument.Tables(1).Rows.Count, _
        2).Range.InsertBefore "Total"
    ActiveDocument.Tables(1).Rows(ActiveDocument.Tables(1).Rows.Count) _
        .Shading.BackgroundPatternColor = wdColorGray20

    'Number of unique words
    ActiveDocument.Tables(1).Rows.Add
    ActiveDocument.Tables(1).Cell(ActiveDocument.Tables(1).Rows.Count, _
        1).Range.InsertBefore "Number of Unique Words in Document"
    ActiveDocument.Tables(1).Cell(ActiveDocument.Tables(1).Rows.Count, _
        2).Range.InsertBefore Trim(Str(WordNum))
    ActiveDocument.Tables(1).Rows(ActiveDocument.Tables(1).Rows.Count) _
        .Shading.BackgroundPatternColor = wdColorAutomatic

    'Number of non-excluded words
    ActiveDocument.Tables(1).Rows.Add
    ActiveDocument.Tables(1).Cell(ActiveDocument.Tables(1).Rows.Count, _
        1).Range.InsertBefore "Number of Non-Excluded Words in Document"
    ActiveDocument.Tables(1).Cell(ActiveDocument.Tables(1).Rows.Count, _
        2).Range.InsertBefore CStr(IngWordCount)

    'Total number of words
    ActiveDocument.Tables(1).Rows.Add
    ActiveDocument.Tables(1).Cell(ActiveDocument.Tables(1).Rows.Count, _
        1).Range.InsertBefore _
        "Number of Words (Excluded and Non-Excluded) in Document"
    ActiveDocument.Tables(1).Cell(ActiveDocument.Tables(1).Rows.Count, _
        2).Range.InsertBefore CStr(TotalWords)
    System.Cursor = wdCursorNormal

    MsgBox "This document contains " & Trim(Str(WordNum)) & " unique words. "
    MsgBox "This document contains " & IngWordCount & " non-excluded words. "
    MsgBox "This document contains a total of " & TotalWords & _
        " (excluded and non-excluded) words. "
    MsgBox "For more statistics on this document, use Tools > Word Count " & _
        "in the original document. "

    Selection.HomeKey wdStory

End Sub
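
On a very large document the slow part of the macro above is likely to be the inner search loop, since every word is compared against the whole Words array. A minimal sketch of just the counting step, assuming Windows (where the Scripting.Dictionary class is available), might look like the following. The name WordFrequencyWithDictionary is hypothetical and not part of the macro above, and the sketch leaves out the exclude list, the sorting, and the table output:

```vba
Sub WordFrequencyWithDictionary()
    'Hypothetical sketch of the counting step only.
    'Assumption: Scripting.Dictionary can be created (Windows).
    Dim Counts As Object        'Maps each word to its number of occurrences
    Dim aword As Range          'Current item of ActiveDocument.Words
    Dim SingleWord As String    'Lower-cased, trimmed text of that item
    Dim Key As Variant          'Loop variable for the dictionary keys

    Set Counts = CreateObject("Scripting.Dictionary")

    For Each aword In ActiveDocument.Words
        SingleWord = Trim(LCase(aword.Text))
        'Keep only items whose first character is a letter a to z
        If SingleWord Like "[a-z]*" Then
            If Counts.Exists(SingleWord) Then
                Counts(SingleWord) = Counts(SingleWord) + 1
            Else
                Counts.Add SingleWord, 1
            End If
        End If
    Next aword

    'List "word <tab> count" in the Immediate window, in no particular order
    For Each Key In Counts.Keys
        Debug.Print Key & vbTab & Counts(Key)
    Next Key
End Sub
```

The dictionary lookup replaces the For j = 1 To WordNum search, so there is no maxwords limit to adjust. The results appear in the Immediate window of the VBA editor (Ctrl+G) rather than in a new document.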

--
Greg Maxey/Word MVP
See:
http://gregmaxey.mvps.org/word_tips.htm
For some helpful tips using Word.

#3
Bill Crowl
 
Thank you. I'll try it.
Bill
Copyright 2004-2024 Microsoft Office Word Forum - WordBanter.
The comments are property of their posters.
 
