***********************************

Changes to the NIST Express toolkit

***********************************

Don Libes, libes@cme.nist.gov

Last revised: 19-Aug-1992

POOP adj. (Acronym for Post-OOP) A paradigm (q.v.) long

awaited by many. Also, reminiscent of the sound made by

the collapse of an overinflated balloon.

OVERVIEW OF CHANGES

The bad news is: Much has changed. You will not be able to recompile

applications without changing them.

The good news is: The system is faster. Much faster. And the library

is based on the Express DIS, and implements everything needed to do

full resolution of all features of Express.

Until formal documentation is written, you will have to look at the

code. The good news is that the code is much much shorter and

cleaner. The bad news is that I left in some of the original code as

comments, so you may be distracted by this.

I have converted over two pieces of programs that depend on the

library. exp2cxx (in ~pdevel/src/fexp2cxx) and the step parser

(in ~pdevel/src/fstep). Since I didn't write either one originally, I

don't take credit for the overall readability, but they at least

provide proof that the library functions.

Here is an overview of what's changed.

- The overall structure has been changed to allow easier interfacing

and more customization. Even sophisticated applications can use the

default main now. To use the default main, define EXPRESSinit_init

as:

void EXPRESSinit_init() {

EXPRESSbackend = your-backend-function-goes-here;

}

Other hooks can be found by looking at the true definition of main.

- The OO system is gone. Everything is pointers to real structures

rather than "objects". This is what accounts for much of the speed

improvement. Debugging is easier, too, since you no longer have to

rely on functions to print out structures.

The downside is that some of the structures have embedded unions. This

can be confusing at first, but at least the compiler and debuggers can

now understand what you are doing and help you out.

- Almost all of the functions in the old library are unnecessary in

the new one since you can access structure elements yourself now.

Nonetheless, for compatibility, I have defined replacements for the

most likely used functions. If you have a function with no

definition, either there is no counterpart, I didn't think anyone

actually used that function, or I just haven't gotten around to

writing it.

- The functions most likely to be counterpart-less are some of the:

schema functions - the definition of a schema changed quite a

bit due to USE/REF and nested schemas changing.

type functions - types don't resemble those in the old

library. See more info below.

expression functions - expressions don't resemble those in the

old library. See more info below.

- Error processing has been sped up. The error messages are

greatly improved (no more overloading of a single error message for

different situations), more descriptive and much (much, much) more

error checking is done. And files are tracked now along with line

numbers for all objects.

Some specific notes can be found below.

GETTING A COPY OF FEDEX AND THE LIBRARY

**************************

Getting a precompiled copy

**************************

The fedex executable and library can be found in ~pdevel/bin and

~pdevel/arch/lib respectively. They will be regularly updated by me

as bugs are fixed. So make a copy if you want a static version.

**************************

Getting the source

**************************

To retrieve the source, link to the RCS directory, check out the

CheckOut file, and then run CheckOut itself. "make" by itself will

build an executable while "make libexpress.a" will build the library.

Here are real commands to do this:

mkdir -p ~/pdevel/src/fexpress2

ln -s ~pdevel/src/fexpress2/RCS ~/pdevel/src/fexpress2

co CheckOut

CheckOut

make

Incidentally, the name 'fexpress2' is temporary while this release is

being tested. Eventually, we will give it a better, more permanent

name.

**************************

'Libmisc' is dead, but ...

**************************

Note that the 'libmisc' library is no longer necessary. (It has been

integrated directly into the express library.) However, you still

need the 'usual' tools in ~pdevel/bin and the 'usual' other

libraries in ~pdevel/arch/lib. You can change the targets in either

Makefile or make_rules as appropriate. The express directory has its

own make_rules for simplicity.

**************************

Documentation

**************************

There is none. Ok, just kidding. What there is, is a file called

Changes which you'll get from CheckOut, describing the changes from

the old version to the new version.

It is very rough. There is little consistency, although I tried for

completeness. (It's 22K.) Nonetheless, it is still an overview and

skimps on precise details of many calls. Really, it's just there to

jog my memory when I write the real documentation, or for experts

(like you) who don't want to wait for the documentation.

MISCELLANEOUS NOTES

The following are miscellaneous notes that you may find helpful -

especially because there is no other documentation. (Sorry.)

Numerous elements in the language are now resolved including:

ALIAS, RULE, QUERY

It is interesting to note that there was formerly no way to even

represent them because the libmisc package had no means to do multiple

inheritance. Steve and I talked about implementing multiple

inheritance but were convinced that it would drastically slow down

every other part of the system. This seemed a poor tradeoff

considering that we only needed inheritance from at most two

orthogonal classes.

Enumerations are now separated into different scopes. For the same

reason as above, this was formerly impossible.

======================================================================

Class x; -> Class_of_what x;

i.e.,

Class_of_Type x;

Similarly, OBJget_class is now specific to whatever class you are using.

I.e.,

OBJget_class(type) -> TYPEget_type(type)

if (class == Class_Aggregate_Type) -> if (TYPEis_aggregate(class))

Rationale: underlying type system changed completely. Class/object

system gone, but efficiently faked. Can no longer call object type

'Class'.

======================================================================

Some people assumed many functions returned const values. Many

functions did in fact return such values. Now they do not.

Rationale: Most functions are now macros, returning pointers right out

of the data structures. Since these are the real objects, they are

writable.

======================================================================

Most objects returned from functions do not have to be OBJfree'd.

You will have to look at the documentation to see which ones. Thus,

OBJfree has been turned into a no-op.

Rationale: Most functions now return pointers right out of the data

structures. Freeing them would corrupt the system.

If you are getting a list, call the appropriate data structure

function to free it. E.g., SCOPEget_entities_use returns a list, so you

should call LISTfree to free it.

======================================================================

SCOPEget_entities_supertype_order now no longer returns USEd entities.

Use SCHEMAget_entities_use and SCHEMAget_entities_ref to get either of

these.

Rationale: At KC's request. This decision might be revisited.

Perhaps another function could be added.

======================================================================

ENUM_TYPEget_items now returns a dictionary instead of a list. Each

element is an expression of type 'enumeration' instead of a symbol.

Rationale: Efficiency.

======================================================================

DYNA_init is dead and gone. Remove all such calls.

Rationale: Hopelessly nonportable and ultimately of little value.

======================================================================

The original pass1/pass2 idea has been revamped. "pass1" is now

referred to as "parse" (since that's what it is). "pass2" is referred

to as "resolve" (since that's what it is). The resolve pass actually

consists of several (currently 5) passes. The current pass number is

stored in EXPRESSpass. This number is really only useful for

debugging purposes.

EXPRESSparse prefers to open the file itself. Either call it as

EXPRESSparse(model,(FILE *)0,"filename");

or EXPRESSparse(model,filepointer,(char *)0);

EXPRESSparse takes a "model" argument that can be a new or old express

abstraction. This allows you to call EXPRESSparse repeatedly to read

additional schemas into an old set.

To create a new express model, call EXPRESScreate().

To resolve an express model, call EXPRESSresolve(model).

======================================================================

The STRING abstraction has been removed. You should use the Standard

C library calls to deal with strings. I've left a couple macros in

place to aid in conversion, but these may go away in the future.

Rationale: The STRING abstraction allowed different underlying

representations for strings, but was incomplete to the point that

users had to assume that the standard C representation was used.

It was pointless to complete it, since the Standard C library is now

very rich in string support. The result would have just been

confusing.

======================================================================

A number of facilities are provided for referencing objects outside

the current file.

1) It is possible to logically insert other files during analysis by

use of an INCLUDE statement. INCLUDE statements were, at one time,

valid Express. However, they are not currently. It is best to think

of them as a preprocessing phase of the implementation that has

nothing to do with the language proper.

(With that in mind...) INCLUDE statements can appear outside a schema

or at the top-level of a schema. Included files are not restricted to

including schemas, but may include, for example, a set of entities, a

rule, etc. For example:

INCLUDE 'schema-file.exp';

2) Referencing a schema that is not defined in the file (or included

from another file) causes fedex to search for a file with the same

name as the schema with a ".exp" extension in the directories named by

the environment variable EXPRESS_PATH. For example, in the C-shell,

you could say:

setenv EXPRESS_PATH "~pdes/data/part42 ~pdes/data/part202"

In order to facilitate this, I recommend that each schema file have

symbolic links pointing to it, one named after each schema in the file

that is likely to be referenced externally. Stable schemas may

have symbolic links placed in a directory of stable part files, while

unstable schemas should be referenced from a specific part directory.

For example, imagine that the directory for stable schemas is

~pdes/data/schemas/standard, while part 202 is still undergoing

evolution. In this case, the appropriate command might be:

setenv EXPRESS_PATH "~pdes/schemas/part42 \

~pdes/schemas/standard"

If not set, the default path of "." (the current directory) is used.
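
The search described above can be sketched as follows. This is only an illustration: schema_candidate is a hypothetical helper, not a toolkit function, and it assumes that the schema name is used verbatim as the file name and that no tilde expansion is done on the directory names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the toolkit: writes the idx-th candidate
 * file name for `schema` into `out`, given a space-separated directory list
 * such as the value of EXPRESS_PATH. A NULL or empty list falls back to ".".
 * Returns 1 on success, 0 if idx is past the last directory or the result
 * does not fit in `out`. */
static int schema_candidate(const char *path, const char *schema,
                            size_t idx, char *out, size_t outsz)
{
    assert(schema != NULL && out != NULL);
    if (path == NULL || *path == '\0')
        path = ".";                       /* default search path */

    const char *p = path;
    for (size_t i = 0; ; i++) {
        p += strspn(p, " ");              /* skip separators */
        if (*p == '\0')
            return 0;                     /* ran out of directories */
        size_t len = strcspn(p, " ");     /* length of this directory */
        if (i == idx) {
            int n = snprintf(out, outsz, "%.*s/%s.exp", (int)len, p, schema);
            return n >= 0 && (size_t)n < outsz;
        }
        p += len;
    }
}
```

A caller would pass getenv("EXPRESS_PATH") as the first argument and try idx = 0, 1, 2, ... until a candidate can be opened or the helper returns 0.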

======================================================================

The old "warning" kludgery is gone. It has been replaced by several

routines in the ERROR package including

ERRORcreate_option

ERRORset_option

ERRORset_all_options

To associate an option string with a particular error, call

ERRORcreate_option.

ERRORcreate_option("subtypes",ERROR_missing_subtype);

To actually set or unset an option, it suffices to say:

ERRORset_option(sc_optarg,set);

where set is a true/false value. This is especially convenient with

getopt, since you can use the same code to set or unset an option just

by testing the option letter inside of the 'set' argument. I.e.

ERRORset_option(sc_optarg,c == 'w');

To print all the options out, say:

LISTdo(ERRORoptions, opt, Error_Option *)

fprintf(stderr,"%s\n",opt->name);

LISTod
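
To make the create/set idiom concrete, here is a self-contained sketch of an option table. The demo_ names and structures are stand-ins invented for illustration; the real Error and Error_Option layouts are not shown in these notes and may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in types: the real Error and Error_Option structures are not shown
 * in these notes. */
struct demo_error  { int enabled; };
struct demo_option { const char *name; struct demo_error *err; };

#define DEMO_MAX_OPTIONS 16
static struct demo_option demo_options[DEMO_MAX_OPTIONS];
static size_t demo_option_count;

/* Associate an option string with an error (cf. ERRORcreate_option).
 * Returns 1 on success, 0 if the table is full. */
static int demo_create_option(const char *name, struct demo_error *err)
{
    assert(name != NULL && err != NULL);
    if (demo_option_count == DEMO_MAX_OPTIONS)
        return 0;
    demo_options[demo_option_count].name = name;
    demo_options[demo_option_count].err = err;
    demo_option_count++;
    return 1;
}

/* Enable or disable every error registered under `name`
 * (cf. ERRORset_option). Returns the number of errors touched. */
static int demo_set_option(const char *name, int set)
{
    int touched = 0;
    assert(name != NULL);
    for (size_t i = 0; i < demo_option_count; i++) {
        if (strcmp(demo_options[i].name, name) == 0) {
            demo_options[i].err->enabled = set ? 1 : 0;
            touched++;
        }
    }
    return touched;
}
```

With this shape, one branch of an option-parsing loop can serve both the "enable" and the "disable" letter by passing a comparison such as c == 'w' as the set argument.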

======================================================================

Fedex has been changed to print errors immediately rather than

buffering them up and sorting them by line number. The underlying

function to toggle this is defined as follows:

ERRORbuffer_messages(boolean);

While the buffering code has been speeded up (it used to call two

extra processes, now it doesn't call any), I see little point to

sorting by line numbers. The order in which diagnostics are presented

to the user is the order in which problems should be resolved. E.g.,

a missing schema will be detected immediately, and will cause many

spurious errors.

======================================================================

The error routines have been beefed up in other ways as well,

especially for robustness. For example, if an internal or operating

system error occurs, a strong attempt is made to produce all previous

diagnostics, rather than just dumping core.

The main entry for reporting errors was changed from

ERRORreport_with_line to ERRORreport_with_symbol.

ERRORreport_with_line still exists for programs that don't know

anything about symbols (in which case, we guess at the information).

Rationale: This was a necessary change in order to provide diagnostics

with filenames. The symbol abstraction itself also had to be

augmented with filenames.

======================================================================

The error messages are formatted a little differently so that the

default Emacs compile bindings can automatically read in and position

the appropriate Express file and display the error at the same time.

As an aside, Jim Wachholz has built an Express mode for Emacs.

Contact him for more info.

======================================================================

I have backed off on the original code's attempt at significant

information hiding. In particular, while some of the hiding worked,

some didn't. For example, users had to know whether information was

returned as a list or a dictionary. In fact, it is possible to hide

this as well - I don't know why Steve didn't bother, except that he

was tired.

For example, instead of a single LISTadd_last routine, there would have to

be different LISTadd_last routines for every class. This would have improved

typechecking.

The new code is more efficient for a variety of reasons. The original

code paid a heavy price in efficiency for dynamic typechecking, and

using individual functions to access each data element in a structure.

The new code allows direct access. There is necessarily some dynamic

typechecking left in the system, but it is quite small. The number of

switch statements is surprisingly small (less than two dozen).

The new code simulates the class hierarchy used by the old code in

spirit. In reality, the class hierarchy has been compressed from 5

levels to 2. The resulting code is much, much faster.

The key notions in the new system are:

a handful of base classes

dictionaries understand classes

Instead of objects being self-descriptive, context is used. The

dictionary is one such example. When you store an object, you

describe it to the dictionary. Upon later retrieval, you get the

object and the description back. When the object is not in the

dictionary, there is no descriptor. Your code takes over the job of

remembering what something is. Invariably, this is very straightforward.

I.e., you might keep a list of entities, in which case you are

guaranteed all the elements on the list are entities.

A small number of operations can be performed on all classes. For

example, it is possible to get the printable description of a class by

saying:

OBJget_type(type)

All OBJ functions are implemented by single-table lookups.

Mnemonically-suggestive characters are used as indices into the OBJ

table.
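
A table indexed directly by a character might look like the sketch below. The row layout, the demo_ names, and the particular tag characters used in the usage note are assumptions made for illustration; only the idea of a single array lookup comes from the text above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stand-in for one row of the OBJ table; the real row layout is not
 * described in these notes. */
struct demo_obj_row { const char *type_name; };

/* One slot per possible character; unused slots have a NULL name. */
static struct demo_obj_row demo_obj_table[UCHAR_MAX + 1];

/* Record the printable description for a class tag. */
static void demo_obj_register(char tag, const char *type_name)
{
    assert(type_name != NULL);
    demo_obj_table[(unsigned char)tag].type_name = type_name;
}

/* Printable description for a class tag, or NULL if the tag is unknown.
 * This is a single array index: no search and no dispatch through
 * per-object function pointers. */
static const char *demo_obj_get_type(char tag)
{
    return demo_obj_table[(unsigned char)tag].type_name;
}
```

For example, after demo_obj_register('e', "entity"), a call to demo_obj_get_type('e') returns the string "entity", while an unregistered tag yields NULL.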

======================================================================

Notes on fedex arguments:

b flag (buffering) - Now "off" by default. fedex reports the

most important error messages first. The idea of

messages appearing in the order of line numbers has

little value, especially in the context of multiple

input files.

r flag (no resolve) - Skip resolve pass.

p flag (print pass info) - This takes a string argument naming the

object types to print out while being processed.

Valid object types are:

p procedure

r rule

f function

e entity

t type

s schema or file

# pass #

E everything (all of the above)

For example, the following prints out entity and rule

names as they are being processed:

fedex -p er

======================================================================

While some ALGxxx functions (macros, really) still exist, some have

been replaced by ones specific to the type of algorithm. For example,

ALGget_parameters should be changed to FUNCget_parameters,

RULEget_parameters, or PROCget_parameters.

======================================================================

The whole idea of passes has been revamped. The old pass2 (now called

resolve) is no longer monolithic but is broken into several more

passes. The old pass2 did a depth-first resolution over the object

tree. Besides requiring a very deep stack, it forced on-demand

resolution which was extremely painful - everything had to constantly

check whether things had been resolved or whether there was infinite

recursion (due to USE/REF).

It was possible to restructure this into several breadth-first passes

over the object tree. It does not appear as though a heavy penalty is

paid for the additional passes. Here is an outline of passes.

RENAME-SCHEMAS

For each schema

For each rename clause

Connect the schema symbol to the real schema.

At this point, some renames and schemas are marked 'failed'.

Interestingly, rather than reading the dictionary to get

schema names, we use a FIFO, since schema names can be

dynamically introduced while resolving USE/REFs when reading

other files.

RENAME-OBJECTS

For each schema

For each rename clause

Connect the final object to the rename

At this point, renames are marked 'rename_resolved'

and some are marked failed.

SUBSUPERS

foreach schema

foreach entity, type (including within functions, etc)

resolve sub/supertypes in types

resolve local types

RESOLVE-TYPES resolve type defs and entity attribute defs

foreach schema

resolve type definitions

foreach entity, alg

resolve attribute types (including LOCALs)

resolve proc/func parameter/return types

At this point, the only types not resolved are the control variables

in query types and repeats. In order to resolve them, you have to

do expression resolution. Fortunately, both can be done in an order

so that no forward references are required.

RESOLVE-INHERITANCE-COUNT (can be combined with RESOLVE-TYPES above)

requires: superclasses to be resolved to entities

foreach scope

foreach entity (e)

X: foreach superclass (sc)

if entity-inheritance(sc) is not calculated

X(sc)

e->inheritance += sc->inheritance

foreach scope, recurse

EXPRESSIONS-&-STATEMENTS

foreach schema

foreach scope (entity, alg)

resolve expression in query, repeat and therefore resolve

type of control in query, repeat

resolve derived attributes

resolve attribute initializers

do only entity attributes have initializers???

resolve statements (recurse)

foreach type

resolve where clause
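
The RESOLVE-INHERITANCE-COUNT step in the outline can be sketched as a memoized recursion. The structure below is a stand-in, and the interpretation of "inheritance" as the number of attributes inherited from all supertypes (so that each supertype contributes its own attributes plus whatever it inherits) is an assumption, not something the outline states.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_SUPERS 4
#define DEMO_UNKNOWN   (-1)

/* Stand-in entity: own_attrs and the meaning of `inheritance` (number of
 * attributes inherited from all supertypes) are assumptions. */
struct demo_entity {
    int own_attrs;
    int inheritance;                       /* DEMO_UNKNOWN until computed */
    size_t nsupers;
    struct demo_entity *supers[DEMO_MAX_SUPERS];
};

/* Compute e->inheritance, first computing it for any supertype that has
 * not been done yet (the step labelled X in the outline). Assumes the
 * supertype graph is acyclic. */
static int demo_inheritance(struct demo_entity *e)
{
    assert(e != NULL);
    if (e->inheritance != DEMO_UNKNOWN)
        return e->inheritance;             /* already calculated */

    int total = 0;
    for (size_t i = 0; i < e->nsupers; i++) {
        struct demo_entity *sc = e->supers[i];
        total += demo_inheritance(sc) + sc->own_attrs;
    }
    e->inheritance = total;
    return total;
}
```

Because each result is cached in the entity, every entity is computed once no matter how many subtypes refer to it. Note that in this sketch a supertype reachable along two paths is counted once per path.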

======================================================================

Original code did not check for redefining keywords. Fixed.

======================================================================

USE and REFERENCE are handled by having separate lists and

dictionaries to remember schemas and individual objects that are USEd

and REFd. 'rename' structures are used to point to the remote object.

(This avoids the need for copying dictionaries, which enabled large

time/space savings.)

Once the rename has been processed, the rename points directly to the

final object, even if several schemas have USEd one another.

(The old USE/REF implementation did not detect recursive refs and

failed ungracefully in the presence of certain schema errors.

Dictionary entries could not be removed while another part of the

code was traversing the dictionary.)
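
The idea of a rename ending up pointing directly at the final object, with recursive references detected along the way, can be sketched as below. The structure, the state names, and the use of recursion are all illustrative assumptions; the library performs this work in its breadth-first rename passes, and its actual 'rename' structure is not shown in these notes.

```c
#include <assert.h>
#include <stddef.h>

enum demo_state { DEMO_UNRESOLVED, DEMO_IN_PROGRESS, DEMO_RESOLVED, DEMO_FAILED };

/* Stand-in rename record. */
struct demo_rename {
    struct demo_rename *via;   /* another rename this one refers to, or NULL */
    void *object;              /* final object once resolved */
    enum demo_state state;
};

/* Resolve r so that r->object points straight at the final object, however
 * many renames lie in between. Returns NULL and marks the rename failed if
 * the chain is circular or ends without an object. */
static void *demo_resolve(struct demo_rename *r)
{
    assert(r != NULL);
    switch (r->state) {
    case DEMO_RESOLVED:    return r->object;
    case DEMO_FAILED:      return NULL;
    case DEMO_IN_PROGRESS: r->state = DEMO_FAILED; return NULL;  /* cycle */
    case DEMO_UNRESOLVED:  break;
    }
    if (r->via != NULL) {
        r->state = DEMO_IN_PROGRESS;       /* guards against recursion */
        r->object = demo_resolve(r->via);
    }
    r->state = (r->object != NULL) ? DEMO_RESOLVED : DEMO_FAILED;
    return r->object;
}
```

After one call, later lookups through the same rename cost a single pointer dereference, which is the property the text describes.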

======================================================================

Enumerations are expressions which are entered into two scopes. One

scope is that of their own type definition. To adhere to the special

visibility rule placed on enumerations, they are also entered into the

immediately enclosing scope. In order to allow multiple enumeration

tags with the same name (but from different enumeration scopes), the

dictionary recognizes such overloads and marks such definitions as

"ambiguous" so that later retrievals fail with an appropriate message,

while other retrievals succeed.

Since the dictionary already knows object types, and this code is only

executed during conflicts, it is not expensive to have the dictionary

do this. However, it did require another dictionary routine

specifically for the purpose of adding enumerations to the enum-scope

to handle enumerations with the same name in the same type scope as a

real error.
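
The "ambiguous" marking can be illustrated with a toy dictionary for the enclosing scope. Everything here is a stand-in: the demo_ names, the fixed-size array, and the use of 'x' as the type code for an enumeration tag are assumptions for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_status { DEMO_NOT_FOUND, DEMO_FOUND, DEMO_AMBIGUOUS };

struct demo_entry {
    const char *name;
    char        type;        /* 'x' marks an enumeration tag (assumed code) */
    int         ambiguous;   /* set once a second enum tag reuses the name */
};

#define DEMO_MAX_ENTRIES 32
struct demo_dict {
    struct demo_entry entries[DEMO_MAX_ENTRIES];
    size_t count;
};

static struct demo_entry *demo_find(struct demo_dict *d, const char *name)
{
    for (size_t i = 0; i < d->count; i++)
        if (strcmp(d->entries[i].name, name) == 0)
            return &d->entries[i];
    return NULL;
}

/* Define `name` in the enclosing scope. Two enumeration tags may share a
 * name; the entry is then marked ambiguous rather than rejected. Any other
 * clash is a real error. Returns 1 on success, 0 on error. */
static int demo_define(struct demo_dict *d, const char *name, char type)
{
    assert(d != NULL && name != NULL);
    struct demo_entry *old = demo_find(d, name);
    if (old != NULL) {
        if (old->type == 'x' && type == 'x') {
            old->ambiguous = 1;            /* only reached on a conflict */
            return 1;
        }
        return 0;                          /* genuine redefinition */
    }
    if (d->count == DEMO_MAX_ENTRIES)
        return 0;
    d->entries[d->count++] = (struct demo_entry){ name, type, 0 };
    return 1;
}

/* Retrieval fails with DEMO_AMBIGUOUS only for overloaded names. */
static enum demo_status demo_lookup(struct demo_dict *d, const char *name)
{
    assert(d != NULL && name != NULL);
    struct demo_entry *e = demo_find(d, name);
    if (e == NULL)
        return DEMO_NOT_FOUND;
    return e->ambiguous ? DEMO_AMBIGUOUS : DEMO_FOUND;
}
```

The extra work happens only when a name is already present, which matches the point that the check is cheap because it runs only during conflicts.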

======================================================================

Formal parameter tags are recorded but not analyzed, since it is

possible to do all type resolution without them. Oddly, tags are not

necessary, though I suppose they could be useful for a run-time evaluator.

======================================================================

Implicit loop controls and ALIAS are handled by associating with them

a "tiny" scope of one element.

The function SCOPEget_nearest_enclosing_entity had to be invented to

extract the true referent of a SELF when you're inside of a tiny

scope.

======================================================================

Local variables are handled the same way at the schema level that they

are at the entity level or any other scope. I only mention this

because the previous implementation did not support locals.

======================================================================

Classes of object types can be represented as bit strings (see

express_basic.h). This enables efficient handling of things like the

-p flag. More importantly, it can be helpful to give search functions

hints, such as when searching for a type (which normally includes

entities as well). For example, this provides a way of figuring out

the type when given the (legal) attribute declaration of:

A1: A1;

It is not sufficient to merely start searching at a superscope since

types can be defined within the current scopes. The important thing

is to ignore attributes. This and the business of allowing duplicate

enumerations are exceptions to the rule of only allowing one

definition with the same name in a single scope.
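
A search that takes a class hint might be sketched as follows. The bit names and values are hypothetical (the real ones are in express_basic.h), and the flat array stands in for whatever scope structure the library actually searches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical class bits; the real values live in express_basic.h. */
enum {
    DEMO_TYPE_BIT      = 1 << 0,
    DEMO_ENTITY_BIT    = 1 << 1,
    DEMO_ATTRIBUTE_BIT = 1 << 2
};

struct demo_def { const char *name; unsigned class_bit; };

/* Return the first definition called `name` whose class is in `wanted`,
 * or NULL. Definitions of other classes are skipped, not treated as hits. */
static const struct demo_def *demo_find(const struct demo_def *defs, size_t n,
                                        const char *name, unsigned wanted)
{
    assert(defs != NULL && name != NULL);
    for (size_t i = 0; i < n; i++)
        if ((defs[i].class_bit & wanted) != 0 &&
            strcmp(defs[i].name, name) == 0)
            return &defs[i];
    return NULL;
}
```

Given a scope holding both an attribute A1 and an entity A1, searching for "A1" with the hint DEMO_TYPE_BIT | DEMO_ENTITY_BIT skips the attribute and returns the entity, which is the behavior needed to resolve the declaration A1: A1.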

======================================================================

CONSTANTs are represented by attributes but with the flag.constant bit

on. Unlike normal attributes, these can be found in non-entity scopes.

======================================================================

Always code as if the person who will maintain your code is a

sadistic, psychopathic maniac who knows where you live.

- David Olsen

Writing documentation actually improves code. The reason is

that it is usually easier to clean up a crock than have to

explain it. - G. Steele.
