Lauren Gao

Design Analysis After Synthesis, Part II

Working with Timing

get_timing_paths: gets timing path objects that meet the specified criteria
- Create custom reporting and analysis
- Returns timing path objects, which can be queried for properties or passed to other Tcl commands for processing

report_timing: performs timing analysis on the specified timing paths of the current Synthesized or Implemented Design
- Returns a file or a string

Related commands: report_timing_summary, report_exceptions, reset_timing

    set paths [get_timing_paths -group clk_tx_clk_core_1 -max_paths 100]
    report_timing -of_objects $paths
    # Which is the equivalent of:
    report_timing -group clk_tx_clk_core_1 -max_paths 100

Syntax:

    get_timing_paths [-from args] [-rise_from args] [-fall_from args]
                     [-to args] [-rise_to args] [-fall_to args]
                     [-through args] [-rise_through args] [-fall_through args]
                     [-delay_type arg] [-setup] [-hold] [-max_paths arg]
                     [-nworst arg] [-unique_pins] [-slack_lesser_than arg]
                     [-slack_greater_than arg] [-group args]
                     [-no_report_unconstrained] [-user_ignored] [-sort_by arg]
                     [-filter arg] [-regexp] [-nocase] [-match_style arg]
                     [-quiet] [-verbose]

- -from / -to: ports, cells, pins, clock objects
- -through: pins, cells, nets
- -delay_type max is equivalent to -setup
- -delay_type min is equivalent to -hold
- -slack_lesser_than: show only paths whose slack is below the given value
- -max_paths, -nworst, -unique_pins

report_timing has options similar to those of get_timing_paths.

Timing Path

Related objects: timing path, net, pin, port

    set mystart [get_cells {i_firctrl/raddrcoe_i_reg[1]}]
    set myend [get_cells {i_firctrl/raddrcoe_i_reg[3]}]
    set mypath [get_timing_paths -from $mystart -to $myend -setup]
    # returns: {i_firctrl/raddrcoe_i_reg[1]/C --> i_firctrl/raddrcoe_i_reg[3]/D}
    set mynets [get_nets -of $mypath]
    set mypins [get_pins -of $mypath]

Properties of Timing Path

filter

Multi-Corner Configuration
- Slow: low voltage, high temp
- Fast: low temp, high voltage
- Setup checks tend to fail at the slow process corner, and hold checks at the fast corner

Demo
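The multi-corner note above can be illustrated with a small worked example. This is a minimal sketch, assuming an ideal clock with zero skew and the textbook slack formulas; the `Corner` class, the function names, and every delay value are hypothetical and are not taken from Vivado or from any device data sheet.

```python
from dataclasses import dataclass


@dataclass
class Corner:
    """Hypothetical delay numbers for one process/voltage/temperature corner."""
    name: str
    clk_to_q_ns: float  # launch flop clock-to-output delay
    logic_ns: float     # combinational plus routing delay on the data path


def setup_slack(period_ns: float, setup_ns: float, corner: Corner) -> float:
    # Data must arrive one setup time before the next capture edge.
    return period_ns - (corner.clk_to_q_ns + corner.logic_ns + setup_ns)


def hold_slack(hold_ns: float, corner: Corner) -> float:
    # Data must stay stable for one hold time after the capture edge.
    return (corner.clk_to_q_ns + corner.logic_ns) - hold_ns


# Illustrative values only (assumptions, not measured data).
slow = Corner("slow", clk_to_q_ns=0.5, logic_ns=3.8)
fast = Corner("fast", clk_to_q_ns=0.2, logic_ns=0.05)
PERIOD_NS, SETUP_NS, HOLD_NS = 4.0, 0.1, 0.3

for corner in (slow, fast):
    print(corner.name,
          "setup slack:", round(setup_slack(PERIOD_NS, SETUP_NS, corner), 3),
          "hold slack:", round(hold_slack(HOLD_NS, corner), 3))
```

With these made-up numbers the long delays of the slow corner produce negative setup slack, while the short delays of the fast corner produce negative hold slack, which is why each check is evaluated at its own worst-case corner.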
UltraFast Design Basic Introduction

What is UltraFast?
Documented design methodology to improve designer productivity
- Methodology recommendations
- Best practices
- Checklist
Customer benefits
- Faster time to market
- Better QoR & runtimes
- Less time with their favorite FAE

UltraFast collection of best-practices
The smarter way for:
- PCB planning
- HDL coding
- Design closure
- XDC constraints
- Design analysis
- Timing closure
for Vivado Design Suite v1.0
Created by FAEs/SAEs from all over the world
- Collection of best practices
- Things to avoid
Knowledge is provided in the form of UG949, checklists and scripts.
"It's good to learn from your mistakes"
"It's much better to learn from other people's mistakes!"

UltraFast Methodology: Why Now?
Device density has been increasing exponentially
- FPGAs are as large as ASICs were a few years ago
Complexity of designs is increasing significantly
- FPGAs are the center of the system, not just "glue-logic"
- Proper front end and back end methodology is essential for project success
Vivado
- Enables easy validation of constraints
- Powerful timing analysis
- Full featured DRC checks
- Superior design analysis capabilities
- Tcl access to complete design database
- ASIC-class FPGA tool, designed to handle these large FPGAs

Overall Strategy for UltraFast Design: Earlier Iterations
- Upfront analysis
- Design closure at each step
[Figure: impact on QoR by design stage. Stages: PCB/Planning; Device/IP selection; IP Integration, RTL Design, Verification; Implementation Closure; Config., Bring-up, Debug. Impact labels: 100x, 10x, 1.2x, 1.1x]

Reduce Design Cycle Time & Cost: Customer Advantage
[Figure: two development timelines, each with the stages Architect, Device Planning, Design Creation, Implementation, Verification/Simulation, Config/Debug]
Normal Development Cycle
- Longer debug cycle
- Non-deterministic debug
UltraFast Design Methodology Development Cycle
- Shorter debug cycle
- More deterministic debug
- Process review steps

UltraFast Checklists
UltraFast Design Methodology Checklist, XTP301, V2014.1
- Project Introduction
- Board and Device Planning
- Design Creation
- Implementation
- Configuration and Debug
UltraFast Design Clocking

- Use MMCM or PLL Properly
- Create an Output Clock
- Clock Resource Selection Summary
- Source Synchronous Interface Clocking

Use MMCM or PLL Properly
(UG949 > Ch4 > Clocking > Controlling the Phase...)
While using MMCM or PLL, pay attention to the following:
- Do not leave any inputs floating
- RST should be connected to the user logic
- Grounding RST can cause problems if the clock is interrupted
- The LOCKED output should be used in the implementation of reset
- Synchronous logic clocked by the clock coming out of the PLL should be held in reset until LOCKED is asserted
- The LOCKED signal needs to be synchronized before it is used in a synchronous portion of the design
- A BUFG in the feedback path matters only if the PLL/MMCM output clock needs to be phase aligned with the input reference clock
- Confirm the connectivity between CLKFBIN and CLKFBOUT

Safe Clock Startup and Sequencing
(PG065 > Ch4 > Customizing and Generating the Core)
Safe Clock Startup
- Enables a stable and valid clock at the output using BUFGCE after LOCKED is sampled High for 8 input clocks
Sequencing
- Enables clocks in a sequence according to the number entered through the GUI
- The delay between two enabled output clocks in the sequence is 8 cycles of the second clock in the sequence
- Useful for a system where modules need to start operating one after the other

Safe Clock Startup and Sequencing Demo

Settings on MMCM or PLL
(UG949 > Ch4 > Clocking > Controlling the Phase...)
Incorrect settings on the MMCM or PLL may:
- Increase clock uncertainty due to increased jitter
- Build incorrect phase relationships
- Make timing more difficult
[Figure: clock uncertainty in timing analysis]

MMCM/PLL Settings, Your Goals: Power or Jitter
(UG949 > Ch4 > Clocking > Controlling the Phase...)
If you select 'Minimize Power', 'Minimize Output Jitter' is removed!
Depending on your goals, the settings in the Clocking Wizard may be changed to:
- Further minimize jitter, and thus improve timing, at the cost of higher power
- Go the other way to reduce power but increase output jitter
[Figure: Clocking Wizard settings. MMCM: Balanced, Minimize Output Jitter. PLL: Balanced, Minimize Output Jitter]

Jitter Comparison Between Different Settings
[Figure panels: MMCM: Minimize Output Jitter; PLL: Minimize Output Jitter; MMCM: Balanced; PLL: Balanced]

Phase Between Output Clock
[Figure labels: 1 4 6, same phase; 2 3 5, same phase]

Creating an Output Clock
(UG949 > Ch4 > Clocking > Creating an Output Clock)
[Figure: ODDR primitive with ports D1, D2, CE, C, S, R, Q]
An effective way: an ODDR can forward a copy of the clock to the output
- This is useful for propagating a clock and DDR data with identical delays
- Tie the D1 input of the ODDR primitive High and the D2 input Low

Clock Resource Selection Summary 1
(UG949 > Ch4 > Clocking > Clock Resource Selection Summary)
BUFG
- Use when a high-fanout clock must be provided to several clock regions throughout the device
- Use for very high fanout non-clock nets such as a global reset
BUFGCE
- Use to stop a large-fanout, several-region clock domain
BUFGMUX/BUFGCTRL
- Use to change clock frequencies or clock sources during the operation of your design

Clock Resource Selection Summary 2
(UG949 > Ch4 > Clocking > Clock Resource Selection Summary)
BUFH
- Use for smaller clock domains of logic that can be contained within a single clock region
BUFR
- Use for small to medium sized clock networks that do not require performance higher than 450 MHz
BUFIO
- Use for externally provided high-speed I/O clocking, generally in source synchronous data capture
BUFMR
- Use when you need to use BUFRs or BUFIOs in more than one vertically adjacent clock region for a single clock source

Clock Resource Selection Summary 3
(UG949 > Ch4 > Clocking > Clock Resource Selection Summary)
PLL and MMCM
- PLL provides better control of jitter
- MMCM can provide a wider range of output frequencies
- For tighter timing requirements, PLLs might be best, provided they can provide the frequency of interest
IDELAY/IODELAY
- Use on an input clock to add small amounts of additional phase offset (delay)
- Use on input data to add additional delay to the data, thus effectively reducing clock phase offset in relation to the data
ODDR
- Use to create an external forwarded clock from the device

Source-Synchronous Interface
[Figure: ISERDES, FPGA fabric, CCIO, BUFIO, BUFR (divide by N), IOCLK, DATA]

Driving Multiple BUFIOs
(UG107 > Appx. A: Multi-Region Clocking)
- Although BUFRs can perform this function, BUFIOs supply the highest performance operation and drive dedicated clock nets within the I/O column
- The placer software automatically places the buffers in the appropriate location

Driving Multiple BUFRs
(UG107 > Appx. A: Multi-Region Clocking)
- If the divide value in the BUFR is being used, then all BUFR instances must be reset while the BUFMRCE is disabled
- The placer software automatically places the buffers in the appropriate location

Driving Multiple BUFRs (with Divide) and BUFIO
(UG107 > Appx. A: Multi-Region Clocking)
- Manually place the buffers with a LOC constraint
- The logic driven by the buffers is automatically placed in the appropriate location

Driving Multiple BUFRs (With and Without Divide)
(UG107 > Appx. A: Multi-Region Clocking)
- Manually place the buffers with a LOC constraint
- The logic driven by the buffers is automatically placed in the appropriate location

Synchronizing BUFRs Driven by a BUFMR
(UG107 > Appx. A: Multi-Region Clocking)
In order to clock a single interface that spans multiple banks, a BUFMR must be used to drive the BUFIO and BUFR in the different regions.
The dividers on each BUFR are independent; they must be synchronized in order to ensure proper operation of the interface.
1. Use a BUFMRCE to disable the clock feeding the BUFRs
2. Assert the CLR on all the BUFRs
   This resets the dividers in the BUFRs
3. Deassert the CLR on all the BUFRs
   This allows the dividers to start on the next rising edge of the input clock (currently gated)
4. Assert the CE on the BUFMRCE
   This starts the clocks to all BUFRs
The BUFRs are now in sync.
[Figure: BUFMRCE with CE input driving three BUFR dividers that share a CLR input]
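The BUFR synchronization sequence above can be sketched as a toy behavioral model. This is an illustration under stated assumptions, not a model of the real primitives: the `Bufr` class, its counter-based divider, and the helper names are all hypothetical, and gating by the BUFMRCE is modeled simply by not delivering input edges.

```python
class Bufr:
    """Toy divide-by-N buffer with a CLR input (assumed behavior)."""

    def __init__(self, divide: int, start_count: int):
        self.divide = divide
        # Each divider powers up at an arbitrary, independent phase.
        self.count = start_count % divide

    def clear(self) -> None:
        # CLR resets the divider.
        self.count = 0

    def input_edge(self) -> bool:
        # Returns True when this input edge yields an output rising edge.
        rising = self.count == 0
        self.count = (self.count + 1) % self.divide
        return rising


def run(bufrs, input_edges: int):
    """Deliver input clock edges to every BUFR and record their outputs."""
    return [[b.input_edge() for b in bufrs] for _ in range(input_edges)]


def in_sync(trace) -> bool:
    """True if all BUFRs produced identical outputs on every input edge."""
    return all(len(set(row)) == 1 for row in trace)


def synchronize(bufrs) -> None:
    # Step 1: clock gated (no input_edge() calls happen in here).
    # Steps 2 and 3: assert, then deassert, CLR on every BUFR.
    for b in bufrs:
        b.clear()
    # Step 4: the caller re-enables the clock by calling run() again.


bufrs = [Bufr(divide=4, start_count=s) for s in (0, 1, 2)]
before = run(bufrs, 8)
synchronize(bufrs)
after = run(bufrs, 8)
print("in sync before:", in_sync(before), "after:", in_sync(after))
```

The point of the sketch is the ordering: clearing the dividers only helps because no input edge arrives between the clear and the re-enable, so every divider starts counting from the same edge.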
RTL Coding Style, Part 1

Basic Functionality
- Blocking statements vs. non-blocking statements
- Incomplete sensitivity list
- Latch inference: an if statement without an else clause

VHDL:

    process (G, D)
    begin
      if (G = '1') then
        Q <= D;
      end if;
    end process;

Verilog:

    always @(G or D)
      if (G)
        Q = D;

- Latch inference: an intended register without a rising edge or falling edge construct
- Why avoid latches: timing analysis becomes more difficult

Incomplete reset specification
- The reset signal will get hooked to the CE pin, thereby creating another unique control set

    always @(posedge clk)
      if (rst)
        reg1 <= 1'b0;
      else begin
        reg1 <= din1;
        reg2 <= din2;  // reg2 has no reset branch, so rst acts as its enable
      end

all_latches

Slice Flip-Flops and Flip-Flop/Latches
Each slice has four flip-flop/latches (FF/L)
- Can be configured as either flip-flops or latches
- The D input can come from the O6 LUT output, the carry chain, the wide multiplexer, or the AX/BX/CX/DX slice input
Each slice also has four flip-flops (FF)
- The D input can come from the O5 output or the AX/BX/CX/DX input
- These don't have access to the carry chain, wide multiplexers, or the slice inputs
If any of the FF/L are configured as latches, the four FFs are not available
[Figure: slice with four LUT/RAM/SRL blocks, each feeding an FF/L and an FF]
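The incomplete reset specification shown earlier in this part can be checked with a small behavioral model. This is a minimal sketch in Python rather than HDL; `clock_edge` is a hypothetical helper that mirrors the `always @(posedge clk)` block, and the register values are arbitrary.

```python
def clock_edge(state, rst, din1, din2):
    """Model one rising clock edge of the incomplete-reset example.

    state is the tuple (reg1, reg2) before the edge; the return value is
    the tuple after the edge.
    """
    reg1, reg2 = state
    if rst:
        reg1 = 0
        # reg2 is not assigned in this branch, so it must hold its value.
    else:
        reg1, reg2 = din1, din2
    return reg1, reg2


# reg2 only updates when rst is low, so rst behaves as an active-low
# clock enable for reg2 even though the code never wrote one.
state = (1, 1)
state = clock_edge(state, rst=1, din1=0, din2=0)
held = state            # reg1 cleared, reg2 held
state = clock_edge(state, rst=0, din1=1, din2=0)
loaded = state          # both registers load their inputs
print(held, loaded)
```

Because `reg2` has to hold its value while `rst` is high, synthesis needs an enable on that register, which is how the reset ends up on the CE pin and creates an extra control set.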
Use of Loops in Code

    reg [3:0] dout;
    integer i;
    always @(posedge clk) begin
      for (i = 0; i <= 3; i = i + 1)
        dout[3-i] <= din[i];
    end

Pros and Cons
- Minimizes coding effort
- May lead to inefficient structures, thereby degrading performance
Xilinx recommends representing the same functionality using constructs that are easier for the tool to interpret.
TIP: It is acceptable to use loops for basic connectivity. When the code infers hardware resources (other than just wires/interconnects), it is better to avoid loops.

    always @(posedge clk) begin
      for (i = 0; i <= 3; i = i + 1) begin
        if (en[i])
          dout[i] <= i;
      end
    end

State-Machine Guidance

Mealy vs. Moore Styles
Main difference:
- Mealy: current state + input => output
- Moore: current state => output
In general, Moore state machines implement best in FPGA devices
- Most often one-hot is the chosen encoding method, and little decode logic is necessary for output values

One-Hot vs. Binary Encoding
The two most popular encodings for FPGA designs are binary and one-hot.
Vivado: FSM_ENCODING accepts "one_hot", "sequential", "johnson", "gray", "auto" and "none"; default: "auto"

Verilog:

    (* fsm_encoding = "one_hot" *) reg [7:0] my_state;

VHDL:

    type count_state is (zero, one, two, three, four, five, six);
    signal my_state : count_state;
    attribute fsm_encoding : string;
    attribute fsm_encoding of my_state : signal is "sequential";

Use of Debug Logic
Debug logic
- Logic that is not necessary for the design function, but which is useful in design analysis
Several methods can assist in this objective
- Guard the logic with a `ifdef, parameter, or generic that can be set to disable or enable these sections of code
- Code the logic in a way that makes it easier to comment out in the future
- Have a separate debug version of a module or entity to interchange for this purpose
Target
- Have a good methodology for debugging the design code
- Have a good way to remove that logic
[Figure: DUT containing user logic and debug logic]

Control Signals and Control Sets
A control set is the grouping of control signals:
- set/reset
- clock enable
- clock
Registers within a slice all share common control signals
- Only registers with a common control set may be packed into the same slice
Designs with several unique control sets
- Have a lot of wasted resources
- Have fewer options for placement, resulting in higher power and lower performance
Designs with fewer control sets
- Have more options and flexibility in terms of placement, generally resulting in improved results

Control Sets
All flip-flops and flip-flop/latches share the same CLK, SR, and CE signals
- This is referred to as the "control set" of the flip-flops
- CE and SR are active high
- CLK can be inverted at the slice boundary
If any one flip-flop uses a CE, all others must use the same CE
- CE gates the clock at the slice boundary
- Saves power
If any one flip-flop uses the SR, all others must use the same SR
- The reset value used for each flip-flop is individually set by the SRVAL attribute
[Figure: slice flip-flops (FF and FF/LATCH), each with D, Q, CE, CK, and SR pins, sharing the control signals]
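The packing rule above (only registers with a common control set may share a slice) can be made concrete with a rough estimate. This is a minimal sketch, assuming eight storage elements per slice (the four FF/L plus four FF described earlier in this part); the register tuples, signal names, and helper functions are hypothetical, and the result is only a lower bound that ignores every other placement constraint.

```python
from collections import defaultdict
from math import ceil

FFS_PER_SLICE = 8  # assumption: four FF/L plus four FF per slice


def group_by_control_set(registers):
    """Group register names by their (clock, clock enable, set/reset) tuple."""
    groups = defaultdict(list)
    for name, clk, ce, sr in registers:
        groups[(clk, ce, sr)].append(name)
    return dict(groups)


def min_slices(groups) -> int:
    """Lower bound on slices: control sets cannot share a slice."""
    return sum(ceil(len(names) / FFS_PER_SLICE) for names in groups.values())


# Eight registers that all share one control set.
shared = [(f"r{i}", "clk", "en", "rst") for i in range(8)]
# The same eight registers, but with four different clock enables.
fragmented = [(f"r{i}", "clk", f"en{i // 2}", "rst") for i in range(8)]

for label, regs in (("shared", shared), ("fragmented", fragmented)):
    groups = group_by_control_set(regs)
    print(label, "control sets:", len(groups), "min slices:", min_slices(groups))
```

In this made-up case the same eight registers need at least four slices instead of one once they are spread over four enables, leaving most flip-flops in those slices unusable by unrelated logic.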
Control Set
report_control_sets
- Indicator of possible packing fragmentation and fitting issues
- Run with the -verbose option to generate a full list

When and Where to Use a Reset
- If an initial state is not specified, it defaults to a logic zero
- It is not necessary to code a global reset for the sole purpose of initializing the device
- Avoiding unnecessary resets limits the overall fanout of the reset net
- Avoiding unnecessary resets simplifies the timing of the reset paths
- Functional simulation should easily identify whether a reset is needed or not
- Using no reset brings much greater flexibility in selecting the FPGA resources to map the logic
[Figure: delay line mapping options. Without reset: SRL, SRL + registers, all registers, LUT or block memory. With reset: registers with a common reset]

Use Active-High Control Signals
- Hierarchical design methods can proliferate LUT usage on active-low control signals
- The inverters cannot be combined into the same slice
- This consumes more power and makes timing difficult
[Figure: flip-flop]

Control a Localized Reset Network
[Figure: chain of D flip-flops clocked by clk; rst_n drives their asynchronous set (active low); the chain output is the local synchronous reset (active high)]
Synchronous Bridge
- The number of flip-flops in the chain determines the minimum duration of the reset pulse issued to the localized network

Control a Localized Reset Network: Verilog

    reg [3:0] synchronizer_ckt;

    always @(posedge clk or negedge rst_n)  // asynchronous, active-low reset
    begin
      if (!rst_n)
        synchronizer_ckt <= 4'hf;  // 4-stage reset synchronization
      else
        synchronizer_ckt <= {synchronizer_ckt[2:0], 1'b0};
    end

    // The final reset signal, which is used to reset the actual
    // flops in the design
    assign synchronized_rst_n = ~synchronizer_ckt[3];

More Info
- UG949: UltraFast Design Methodology Guide for the Vivado Design Suite, chapter 4
- WP272: Get Smart About Reset: Think Local, Not Global
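The four-stage reset synchronizer from the Verilog listing in this part can be stepped through in a small behavioral model. This is a minimal sketch in Python, assuming the same register width and shift direction as the listing; the `ResetSynchronizer` class and its method names are hypothetical.

```python
class ResetSynchronizer:
    """Behavioral model of the 4-stage reset synchronizer listing."""

    MASK = 0b1111

    def __init__(self):
        # Assume the external rst_n has been asserted at least once.
        self.ckt = self.MASK

    def async_reset(self) -> None:
        # rst_n low: every stage is set immediately, without a clock.
        self.ckt = self.MASK

    def clock_edge(self) -> None:
        # rst_n high: shift a zero in at the bottom on each rising edge.
        self.ckt = (self.ckt << 1) & self.MASK

    @property
    def synchronized_rst_n(self) -> int:
        # Inverted top stage: 0 while the local reset is active.
        return 0 if (self.ckt >> 3) & 1 else 1


sync = ResetSynchronizer()
sync.async_reset()
trace = []
for _ in range(5):
    sync.clock_edge()
    trace.append(sync.synchronized_rst_n)
print(trace)
```

The local reset asserts asynchronously but releases only after the zero has shifted through all four stages, so the release is aligned to `clk` and the chain length sets the minimum pulse width seen by the localized network.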
RTL Coding Style, Part 2

Know What You Infer
For larger than 4-bit addition, subtraction and add-sub
- Carry chain + one LUT per 2-bit addition (one bit from each operand)
- 8-bit + 8-bit adder: 8 LUTs + associated carry chain
Ternary addition without the use of a register in between
- One LUT per 3-bit addition (one bit from each of the three operands)
- 8-bit + 8-bit + 8-bit adder: 8 LUTs + associated carry chain
In general, multiplication is targeted to DSP blocks
- Three levels of pipelining around it generate the best setup, clock-to-out, and power characteristics

Know What You Infer
Shift registers or delay lines that do not require reset or multiple tap points are generally mapped into Shift Register LUTs (SRLs)
- To best utilize SRLs, avoid using reset for those blocks
- In 7-series FPGAs, each LUT can delay serial data from 1 to 32 clock cycles
For conditional code resulting in standard MUX components
- 4-to-1 MUX: 1 LUT, one logic level
- 8-to-1 MUX: 2 LUTs + 1 MUXF7, one logic level
- 16-to-1 MUX: 4 LUTs + 2 MUXF7 + 1 MUXF8, one logic level

Performance Considerations When Implementing RAM
- Using Dedicated Blocks or Distributed RAMs
- Using the Output Pipeline Register
- Selecting the Proper Block RAM Write Mode

Using Dedicated Blocks or Distributed RAMs
RAMs may be implemented in either
- The dedicated block RAM
- LUTs, using distributed RAM
The First Choice Criterion: Required Depth
- Memory arrays deeper than 256 are generally implemented in block memory
[Figure: CLB_LL (Slice_L, Slice_L) and CLB_LM (Slice_L, Slice_M)]
Each block RAM block can be used as a 36Kb BRAM/FIFO, or as an 18Kb BRAM plus an 18Kb BRAM/FIFO

Using the Output Pipeline Register
- Using an output register is required for high performance designs
- It is recommended for all designs
- This improves the clock-to-output timing of the block RAM
- Having both registers gives a total read latency of 3
- Determine early whether an extra clock cycle of latency during reads is tolerable
- Using asynchronous reset impacts RAM inference, and should be avoided
[Figure: block RAM followed by two D/Q registers: register out of memory primitives, register out of memory core]

Selecting the Proper Block RAM Write Mode
Xilinx recommends the following guidelines for selecting the best write mode for a particular operation.
Consider Functionality First
- If you must see the prior value in the block RAM during write, select READ_FIRST
- If you want to read the new data being written to the block RAM, use WRITE_FIRST
- If you do not care about the data read during writes, then the next selection criterion has to do with memory collisions
Use NO_CHANGE Mode
- In all other cases, Xilinx recommends NO_CHANGE mode
- NO_CHANGE has the best power characteristics
[Figure: READ_FIRST, WRITE_FIRST, NO_CHANGE]
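The difference between the three write modes above is easiest to see on the read output during a write. This is a minimal behavioral sketch of a single-port memory, not a model of the actual block RAM primitive; the `SinglePortRam` class, its method names, and the stored values are hypothetical.

```python
WRITE_MODES = ("READ_FIRST", "WRITE_FIRST", "NO_CHANGE")


class SinglePortRam:
    """Toy single-port synchronous RAM with a selectable write mode."""

    def __init__(self, depth: int, write_mode: str):
        if write_mode not in WRITE_MODES:
            raise ValueError(f"unknown write mode: {write_mode}")
        self.mem = [0] * depth
        self.write_mode = write_mode
        self.dout = 0

    def clock_edge(self, addr: int, we: bool = False, din: int = 0) -> int:
        if we:
            old = self.mem[addr]
            self.mem[addr] = din
            if self.write_mode == "READ_FIRST":
                self.dout = old   # prior contents appear on the output
            elif self.write_mode == "WRITE_FIRST":
                self.dout = din   # newly written data appears on the output
            # NO_CHANGE: the output keeps whatever it showed before
        else:
            self.dout = self.mem[addr]
        return self.dout


def output_during_write(write_mode: str) -> int:
    ram = SinglePortRam(depth=4, write_mode=write_mode)
    ram.mem[0], ram.mem[1] = 11, 22
    ram.clock_edge(addr=0)                         # plain read of address 0
    return ram.clock_edge(addr=1, we=True, din=99)  # write to address 1


for mode in WRITE_MODES:
    print(mode, output_during_write(mode))
```

In every mode the memory location ends up holding the new data; only the value presented on the output during the write cycle differs.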
DSP Slice Features
[Figure: DSP slice block diagram. Recoverable labels: MULT, ADD, OpMode, ALUMode, CarryIn, 17-bit shift, inputs A, B, C, D, registers A REG, B REG, C REG, M REG, P REG, pattern detect, input conditioning, OPCTL]
The DSP blocks can perform many different functions
- Multiplication
- Addition and subtraction
- Comparators
- Counters
- General logic
Fully pipeline the code intended to map into the DSP48
DSP48E1 slice registers contain only resets, and not sets
- Avoid asynchronous resets, since the DSP slice only supports synchronous reset operations
The DSP48E1 blocks use a signed arithmetic implementation
- Code using signed values in the HDL source to best match the resource capabilities
- The bit precision for signed data is 18 bits by 25 bits
- The bit precision for unsigned data is 17 bits by 24 bits
- For Verilog code, data is considered unsigned unless otherwise declared in the code

Coding for P